Experts believe these new findings will help us better understand how several diseases develop and behave, which may lead to more effective and targeted treatments.
The human genome has twice as many genes as previously thought. Many of the previously unknown genes play a role in cellular control, which may affect the development of human diseases.
The international consortium of scientists analyzed and experimented on sequencing data from 140 different types of human cells and identified thousands of DNA regions that help direct our genes' activities.
Manolis Kellis, an associate professor of computer science at MIT, and co-author of a report that has appeared in several scientific journals, including Nature, said:
"Humans are 99.9 percent identical to each other, and you only have one difference in every 300 to 1,000 nucleotides. What ENCODE allows you to do is provide an annotation of what each nucleotide of the genome does, so that when it's mutated, we can make some predictions about the consequences of the mutation."
What is ENCODE?
ENCODE (Encyclopedia of DNA Elements) is a public research consortium launched in 2003 by the NHGRI (National Human Genome Research Institute) in the USA and the EMBL-European Bioinformatics Institute (EMBL-EBI) in the UK. Its aim is to identify all the functional components of the human genome. The NHGRI and EMBL-EBI pledged to release all project data immediately into public databases.
What scientists initially thought was junk DNA is actually biochemically active.
The results of a pilot project, which looked at 1% of the human genome, were published in 2007.
To map epigenomic modifications, the researchers gathered data from various cell types. While some laboratories measured histone modifications, others gauged the accessibility of various DNA stretches by cutting the DNA into fragments with enzymes.
ENCODE is a collaboration of 442 scientists from 32 laboratories in Japan, Singapore, Spain, the USA and the UK. Together, they generated and examined over 15 terabytes of raw data, all of which is now publicly available. They have used approximately 300 years' worth of computer time, focusing on 147 tissue types to find out what turns certain genes on and off, and the specific characteristics of switches in different cell types.
The ENCODE scientists found that about 80% of the human genome is involved in some type of biochemical event, such as binding to proteins that influence how neighboring genes are used. They also found that the very same regulatory regions play different roles, depending on the kind of cell they are acting in.
The scientists analyzed the conservation of the A, T, C and G nucleotides in the new regulatory regions they had identified. Nucleotides are conserved if they stay the same over long periods during our evolution. This can be examined by comparing variation either between different species or among individuals of the same species.
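As a rough illustration of what "conserved" means here, the short Python sketch below scores each position in a set of aligned sequences by the fraction of sequences that share the most common nucleotide. It is a simplified stand-in, not the method the ENCODE scientists used: the function name `conservation_scores` and the toy sequences are invented for this example and are not real genomic data.

```python
from collections import Counter


def conservation_scores(aligned_sequences):
    """Return, for each aligned position, the fraction of sequences
    that carry the most common nucleotide at that position."""
    if not aligned_sequences:
        return []
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("all sequences must have the same aligned length")

    scores = []
    # zip(*...) walks the alignment column by column
    for column in zip(*aligned_sequences):
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(most_common_count / len(column))
    return scores


# Hypothetical toy alignment (illustrative only, not ENCODE data).
# Each string could stand for a different species or a different individual.
toy_alignment = [
    "ATCGGA",
    "ATCGTA",
    "ATCAGA",
    "ATCGGC",
]

scores = conservation_scores(toy_alignment)
# Positions where every sequence has the same nucleotide
fully_conserved = [i for i, s in enumerate(scores) if s == 1.0]
print(scores)
print(fully_conserved)
```

In this toy case the first three positions are identical in every sequence, so they count as fully conserved, while the last three each vary in one sequence. Real conservation analyses work on whole-genome alignments and use statistical models of evolution rather than a simple majority fraction.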
In an online communiqué, the European Bioinformatics Institute wrote:
"On 5 September, an international team of researchers reveal that much of what has been called 'junk DNA' in the human genome is actually a massive control panel with millions of switches regulating the activity of our genes. Without these switches, genes would not work - and mutations in these regions might lead to human disease."
Three Billion Pairs of Genetic Code
So far, all three billion pairs of genetic code that make up human DNA have been analyzed by ENCODE. Scientists at the European Bioinformatics Institute explained that they have identified the genome function of 4 million gene switches, which will help researchers home in on specific areas of human disease, and hopefully find ways to better treat or cure them. They added that the switches are frequently a long way along the genome from the gene they regulate.
Ewan Birney of the European Bioinformatics Institute, lead analysis coordinator for ENCODE, said: "Our genome is simply alive with switches: millions of places that determine whether a gene is switched on or off. The Human Genome Project showed that only 2% of the genome contains genes, the instructions to make proteins. With ENCODE, we can see that around 80% of the genome is actively doing something. We found that a much bigger part of the genome - a surprising amount, in fact - is involved in controlling when and where proteins are produced, than in simply manufacturing the building blocks."
Ian Dunham, also of the European Bioinformatics Institute, said that ENCODE is a useful research tool for any researcher looking into human diseases. Scientists investigating diseases often have a good idea about which genes are involved, but need data on which switches play a role. In some cases the locations of these switches are not where they expected them to be. Dunham said: "ENCODE gives us a set of very valuable leads to follow to discover key mechanisms at play in health and disease. Those can be exploited to create entirely new medicines, or to repurpose existing treatments."
A principal investigator on ENCODE, Dr Michael Snyder, professor and chair at Stanford University, explained that ENCODE provides the knowledge we need to look beyond the genome's linear structure to how the whole network is connected. Genome-wide association studies are helping us understand where certain genes are located, as well as which sequences control them. Snyder said: "Because of the complex, three-dimensional shape of our genome, those controls are sometimes far from the gene they regulate and looping around to make contact. Were it not for ENCODE, we might never have looked in those regions. This is a major step toward understanding the wiring diagram of a human being. ENCODE helps us look deeply into the regulatory circuit that tells us how all of the parts come together to make a complex being."
Previously, generating and storing enormous volumes of data was a problem in biomedical research. However, as genome sequencing has become more productive and more economical, the focus has moved to analysis, i.e. interpreting data generated from genome-wide association studies. Cambridge University scientists said: "ENCODE partners have been working systematically through the human genome, using the same computational and wet-lab methods and reagents in laboratories distributed throughout the world."
Ewan Birney said:
"Getting the best people with the best expertise together is what this is all about. ENCODE has really shown that leading life scientists are very good at collaborating closely on a large scale to produce excellent foundational resources that the whole community can use."
The scientists emphasized that it will be several years before doctors and patients see any tangible benefits from ENCODE.