How are reference genomes assembled?

New era in genetic research: Scientists produce 64 new human reference genomes

The DNA in our cells determines our appearance, our health and, in part, our mental abilities and who we are. The Human Genome Project deciphered this genetic code 20 years ago. Now researchers have gained a new, better picture of the genetic diversity of our species thanks to 64 new reference genomes, which also include structural variants that could not be found with the technology of the time. Among other things, these reference genomes show which parts of a person's genome were inherited from which parent.

Common methods for genetic analysis are reaching their limits

It has now been 20 years since researchers working on the Human Genome Project first read the human genetic code. At that time, the scientists took several DNA samples and created a first, incomplete human reference genome. Since that success, countless other genomes have been sequenced.

However, the usual methods of DNA analysis read out short fragments, only about a hundred base pairs long, and then reassemble them by aligning them to a reference genome. Repeated base sequences and large structural changes can hardly be detected this way, because a short fragment that lies entirely inside a repeat fits the reference in several places equally well. In the recent past, however, new sequencing methods have been developed that do not have these weaknesses.
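The limitation described above can be illustrated with a toy sketch. The sequences, the repeat unit and the helper name find_placements are invented for illustration, and exact string matching stands in for the much more tolerant alignment that real tools perform. The point is only that a short read lying inside a repeat has several equally good placements, while a longer read that spans the repeat and its flanks has one.

```python
def find_placements(reference, read):
    """Return every 0-based position where the read matches the reference exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        # Continue one base further so overlapping matches are also found.
        start = reference.find(read, start + 1)
    return positions


# Hypothetical reference: unique flank, three copies of "ACGT", unique flank.
reference = "TTGCA" + "ACGT" * 3 + "GGATC"

short_read = "ACGTACGT"                  # lies entirely inside the repeat
long_read = "GCA" + "ACGT" * 3 + "GGA"   # spans the repeat and both flanks

short_hits = find_placements(reference, short_read)
long_hits = find_placements(reference, long_read)

print("short read placements:", short_hits)
print("long read placements:", long_hits)
```

In this toy case the short read fits in two places and the long read in exactly one, which is the intuition behind using long reads to resolve repetitive regions.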

"The first human genome sequence was a big step forward, but it was incomplete. In addition to the variation of individual bases, we now know that structural variants also contribute significantly to the genomic differences between individuals," said Charles Lee of the Jackson Laboratory for Genomic Medicine. "These variants affect how genes function and can contribute to disease, differences in drug response, and more. Knowing how they differ in individuals and in different populations is necessary in order to implement more effective genomic medicine," explains Lee's colleague Qihui Zhu.

New technologies lead to new knowledge

An international research consortium led by Peter Ebert from Heinrich Heine University in Düsseldorf has now made an important contribution to this knowledge. With their new sequencing method, the researchers were able to create new, more precise reference genomes for humans. They obtained the necessary DNA from 32 people from different parts of our planet who belong to a total of 25 population groups.

The researchers analyzed each genome with long-read sequencing and, at the same time, sequenced its paternal and maternal parts separately. In every cell a person carries 23 pairs of chromosomes. In each of these pairs, one chromosome comes from the father and one from the mother.

"For every human individual who participated in the study, we identified not one but two genomes - one for each set of chromosomes. So far we have not been able to distinguish whether the genetic variation originates from one or the other set of chromosomes. We have now been able to solve this thanks to the progress made by the consortium," says Jan Korbel from the European Molecular Biology Laboratory (EMBL).
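Why it matters to resolve the two chromosome sets separately can be shown with a small sketch. The two people, their alleles and the helper name unphased_genotype are hypothetical and are not taken from the study. Each person is represented by two chromosome copies with one letter per variant site; the code shows that two people can look identical when only the alleles per site are known, yet differ once it is known which alleles sit together on the same copy.

```python
def unphased_genotype(hap_a, hap_b):
    """Collapse two chromosome copies into per-site allele sets.

    The result records which alleles are present at each site,
    but no longer which copy carried which allele.
    """
    return [frozenset((a, b)) for a, b in zip(hap_a, hap_b)]


# Two hypothetical people, each with two variant sites per chromosome copy.
person_1 = ("AG", "CT")  # A and G sit together on one copy, C and T on the other
person_2 = ("AT", "CG")  # A and T sit together on one copy, C and G on the other

# Without separating the copies, both people show {A, C} at site 1 and {G, T} at site 2.
same_unphased = unphased_genotype(*person_1) == unphased_genotype(*person_2)

# With the copies resolved, the two people carry different chromosome copies.
same_phased = set(person_1) == set(person_2)

print("identical when copies are merged:", same_unphased)
print("identical when copies are resolved:", same_phased)
```

The merged comparison reports the two people as identical, while the resolved comparison does not; this is the kind of information that is lost when the two chromosome sets cannot be told apart.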

64 new reference genomes reveal previously unknown variation

In this way, the researchers obtained 64 new reference genomes that allow a more comprehensive view of the genetic differences between the various human populations. They also show genetic differences between individuals and between the two inherited chromosome sets within a single person. In initial comparisons of the reference genomes, around 107,500 structural variants were found, 68 percent of which were previously unknown.

The reference genomes apparently also confirm that the cradle of mankind lies in Africa. There the genomes are most similar in terms of their structural variants, so that an origin from a common original gene pool can be assumed. "Our results clearly show that the genome from Africa contains the deepest reservoir of as yet unexplored genetic structural variants," said Ebert and his colleagues.

In most other populations, the genomes tell a story of change and intermingling. The combinations of structural variants were particularly diverse among African Americans, which the researchers attribute to the transatlantic slave trade and migration in the colonial era.

New era in genetic research?

The analysis of these reference genomes has only just begun. However, the researchers already expect a new era in genome research to begin. "These genomes will pave the way for a new wave of scientific discoveries about the biology of the human genome and the relationship between genetic variation and disease," says Bernardo Rodriguez-Martin from EMBL.

The researchers hope that understanding the differences they found will significantly improve the ability to make genetic discoveries related to health and disease.

via Heinrich Heine University Düsseldorf