- Scientists say they finally have sequenced the complete human genome.
- This includes much of the 8 percent that was missing from the first “draft” of the genome.
- Two competing sequencing technologies made the newly sequenced portions possible.
Twenty-one years ago, researchers announced the first “draft” of the sequence of the entire human genome. It was a monumental achievement, but about 8% of the genome was still missing from the sequence. Now scientists working together around the world say they’ve finally filled in that missing 8%.
If their work stands up to peer review and it turns out that they really did sequence and assemble the entire human genome, gaps and all, it could change the future of medicine.
What’s in a genome?
Sequencing the human genome has long been a huge project with laudable goals. Why? Because as humans understand their genetic code better, they can make better, more personalized medicines, including the kind of gene-based medicine behind the first effective COVID-19 vaccines.
Humans have 46 chromosomes, in 23 pairs, which represent tens of thousands of individual genes. Each gene is made up of a number of base pairs made of adenine (A), thymine (T), guanine (G) and cytosine (C). There are billions of base pairs in the human genome.
In June 2000, the Human Genome Project (HGP) and the private company Celera Genomics announced this first “draft” of the human genome. It was the result of years of work that picked up pace as humans built better computers and algorithms to process the genome. At the time, scientists were surprised to estimate that, out of more than 3 billion individual “letters” of base pairs, humans had only 30,000 to 35,000 genes. Today that estimate is much lower, hovering just above 20,000.
Three years later, HGP completed its mission to map the entire human genome and defined its terms this way:
“‘Finished sequence’ is a technical term meaning that the sequence is highly accurate (with fewer than one error per 10,000 letters) and highly contiguous (with the only remaining gaps corresponding to regions whose sequence cannot be reliably resolved with current technology).”
“Current technology” is doing a lot of work here. At the time, HGP relied on bacterial artificial chromosomes (BACs): scientists used bacteria to clone each piece of the genome and then studied the pieces in small groups. A complete “BAC library” consists of 20,000 carefully prepared bacteria containing cloned genes.
But this BAC process inherently misses parts of the entire genome. The reason for this is a great introduction to what the new team of scientists have helped accomplish.
A breakthrough in sequencing
What is hidden in the elusive 8% of the genome that the 2000 “draft” left untouched? The base pairs in this section form many, many repeated patterns, which made them too difficult to study using the bacterial cloning method.
BAC and other approaches simply weren’t suited to the remaining, repeat-heavy 8% of the genome. “Current DNA sequencers, made by Illumina, take small fragments of DNA, decode them and reassemble the resulting puzzle,” STAT’s Matthew Herper reports. “It works well for most of the genome, but not in areas where the DNA code is the result of long repeating patterns.”
It makes intuitive sense: imagine keeping your place while counting from 1 to 50, compared with counting 1, 2, 1, 2 again and again. Part of what made the BAC method work was that scientists carefully minimized and matched overlaps between pieces, which becomes nearly impossible in the unexplored, repeat-rich part of the genome.
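To make the counting analogy concrete, here is a minimal Python sketch. It is a toy illustration, not how any real assembler works: the two sequences are made up, the helper name `read_set` is hypothetical, and reads are assumed to be error-free. The sequences differ only in how many times "GT" repeats. With 4-letter reads, both yield exactly the same set of fragments, so the fragments alone cannot tell you how long the repeat is. With 16-letter reads, the sets differ.

```python
def read_set(genome: str, read_length: int) -> set[str]:
    """Return every distinct substring of the given length.

    This stands in for an idealized, error-free set of reads.
    """
    return {genome[i:i + read_length] for i in range(len(genome) - read_length + 1)}


# Two made-up sequences that differ only in the number of "GT" repeats.
genome_a = "AAAC" + "GT" * 5 + "CTTT"
genome_b = "AAAC" + "GT" * 8 + "CTTT"

# Short reads: every fragment sits inside the repeat or overlaps one flank.
short_a, short_b = read_set(genome_a, 4), read_set(genome_b, 4)

# Long reads: fragments reach from a flank deep into (or across) the repeat.
long_a, long_b = read_set(genome_a, 16), read_set(genome_b, 16)

print("short reads identical:", short_a == short_b)
print("long reads identical:", long_a == long_b)
```

The sketch ignores how often each fragment appears. Real pipelines can use that coverage depth as a clue, but it is a noisy one, which is part of why long repeats stayed unresolved.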
So what’s different about the new approaches? Let’s look at what they are first. California-based Pacific Biosciences (PacBio) and UK-based Oxford Nanopore use different technologies, but are racing toward the same goal.
PacBio uses a system called HiFi, in which DNA fragments are read around and around, literally in the form of circles, until they have been read in full and with high fidelity, hence the name. The system is only a few years old and represents a big step forward in both length and accuracy for these longer stretches.
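One way to see why reading the same circle repeatedly improves fidelity is a simple majority vote. The Python sketch below is only an illustration under strong assumptions: the passes are invented, they are assumed to be the same length and already aligned, and the function name `consensus` is hypothetical. PacBio's actual consensus method is more sophisticated than this.

```python
from collections import Counter


def consensus(passes: list[str]) -> str:
    """Combine several noisy reads of one molecule by per-position majority vote."""
    if len({len(p) for p in passes}) != 1:
        raise ValueError("this sketch assumes equal-length, pre-aligned passes")
    # zip(*passes) yields one column of letters per position in the sequence.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*passes))


# Five made-up passes over the same fragment; two contain a single-letter error.
passes = [
    "ACGTTAGC",
    "ACGATAGC",  # error at position 3
    "ACGTTAGC",
    "ACCTTAGC",  # error at position 2
    "ACGTTAGC",
]

print(consensus(passes))
```

Because the two errors fall at different positions, each is outvoted by the other four passes, and the combined read matches the error-free passes.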
Oxford Nanopore, meanwhile, uses electric current in its proprietary devices. DNA strands are squeezed through a microscopic nanopore, one molecule at a time, and the way each base disturbs the current reveals what kind of base it is. By reading each base in turn, scientists can identify the entire strand.
In the new study, posted to the biology preprint server bioRxiv, an international consortium of about 100 scientists used PacBio and Oxford Nanopore technologies to track down some of the remaining unknown sections of the human genome.
The amount of ground the consortium has covered is staggering. “The consortium said it has increased the number of DNA bases from 2.92 billion to 3.05 billion, a 4.5 [percent] increase. But the number of genes only increased by 0.4 [percent], to 19,969,” STAT reports. This shows how heavily repeated base pair sequences dominate this region relative to the genes it contains.
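The quoted figures can be checked with a few lines of Python. The base counts, the gene total, and the 0.4 percent figure come from the quote. The "genes before" and "genes added" values are back-of-the-envelope estimates derived from that rounded 0.4 percent, not numbers reported by the consortium.

```python
# Figures taken from the quoted report.
bases_before = 2_920_000_000
bases_after = 3_050_000_000
genes_after = 19_969
gene_increase_pct = 0.4  # already rounded in the source

# Percentage increase in sequenced bases.
bases_added = bases_after - bases_before
base_increase_pct = bases_added / bases_before * 100

# Rough inference only: work backward from the rounded 0.4 percent.
genes_before_est = genes_after / (1 + gene_increase_pct / 100)
genes_added_est = genes_after - genes_before_est

print(f"{bases_added:,} new bases, a {base_increase_pct:.1f}% increase")
print(f"roughly {genes_added_est:.0f} additional genes (estimate)")
```

That works out to about 130 million new bases but only on the order of 80 additional genes, which is what you would expect from regions made up mostly of repeats.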
The missing links
Sequencing pioneer George Church, a Harvard University biologist, told STAT that if this work passes peer review, it will be the first time any vertebrate genome has been fully mapped. And the reason seems to be simply that the two new technologies can read very long chains of base pairs at once.
Why is missing genetic information so important? Well, there’s a lot of favoritism in the study of genes, with a handful of the most popular genes taking up most of the research interest and funding. Forgotten genes hold many key mechanisms that cause disease, for example.
There is a small catch, although the same catch applied to the 2000 announcement of the first draft genome. Both projects studied cells that had only 23 chromosomes instead of the full 46. That is because they used cells derived from the reproductive system, where eggs and sperm each carry half of a full chromosome load.
The cells come from a hydatidiform mole, a kind of reproductive growth that represents an extremely early and non-viable union between a sperm and an egg cell lacking a nucleus. Choosing this cell type, which has been preserved and cultured as a “cell line” for research purposes, cuts the enormous sequencing work in half.
The next step is to publish the study in a peer-reviewed journal. After that, PacBio and Oxford Nanopore will look to sequence the entire human genome with all 46 chromosomes. But we may be waiting a while.