Today’s genetic engineers have a plethora of resources at their disposal: an ever-growing number of massive datasets available online, highly precise gene editing tools like CRISPR, and cheap gene sequencing methods. But the proliferation of new technologies has not come with a clear roadmap to help researchers determine which genes to target, which tools to use, and how to interpret their results. So a team of scientists and engineers from Harvard’s Wyss Institute for Biologically Inspired Engineering, Harvard Medical School (HMS), and the MIT Media Lab decided to create one.
The Wyss team created an integrated pipeline for performing genetic screening studies that encompasses every step of the process, from identifying target genes of interest to cloning them and screening them quickly and efficiently. The protocol, called Sequencing-based Target Ascertainment and Modular Perturbation Screening (STAMPScreen), is described in Cell Reports Methods, and the associated open-source algorithms are available on GitHub.
“STAMPScreen is a streamlined workflow that allows researchers to easily identify genes of interest and perform genetic screens without having to guess which tool to use or which experiments to perform to achieve desired results,” said corresponding author Pranam Chatterjee, Ph.D., a former graduate student at the MIT Media Lab who is now the Carlos M. Varsavsky Research Fellow at HMS and the Wyss Institute. “It is fully compatible with many existing databases and systems, and we hope that many scientists can take advantage of STAMPScreen to save time and improve the quality of their results.”
Frustration is the mother of invention
Chatterjee and Christian Kramme, co-first author of the article, were frustrated. The two scientists were trying to explore the genetic foundations of different aspects of biology, such as fertility, aging, and immunity, by combining the strengths of computational methods (think algorithms) and genetic engineering (think gene sequencing). But they kept running into problems with the various tools and protocols they were using, all of which are commonplace in science labs.
Algorithms that purported to sift through an organism’s genes to identify those with a significant impact on a given biological process could tell when a gene’s expression pattern changed, but provided no insight into the cause of that change. When the researchers wanted to test a list of candidate genes in living cells, it wasn’t immediately clear what kind of experiment they needed to run. And many of the tools available for inserting genes into cells and screening them were expensive, time-consuming, and inflexible.
“I was using methods known as Golden Gate and Gateway to clone genes into vectors for screening experiments, and it took me months and thousands of dollars to clone 50 genes. And using Gateway, I couldn’t physically barcode the genes to identify which one went into which vector, which was a critical requirement for my downstream sequencing-based experimental design. We thought there had to be a better way to do this kind of research, and when we couldn’t find one, we took on the challenge of creating it ourselves.”
Christian Kramme, co-first author of the study and graduate student, Wyss Institute for Biologically Inspired Engineering at Harvard
Kramme partnered with co-first author and Church lab member Alexandru Plesa, who was experiencing similar frustrations creating genetic vectors for his project. Kramme, Plesa, and Chatterjee then set to work defining what would be needed to create an end-to-end platform for genetic screening that would work for all of their projects, from protein engineering to fertility and aging.
Pieces on the bench
To improve the early stages of genetic research (identifying the genes of interest to study), the team created two new algorithms to meet the need for computational tools capable of analyzing and extracting information from the increasingly large datasets generated by next-generation sequencing (NGS). The first algorithm takes standard data on a gene’s expression level and combines it with information about the state of the cell, as well as information about proteins known to interact with the gene. The algorithm assigns a high score to genes that are strongly connected to other genes and whose activity is correlated with large changes at the cellular level. The second algorithm provides higher-level information by generating networks that represent dynamic changes in gene expression during cell type differentiation, and then applying measures of centrality, such as Google’s PageRank algorithm, to rank the main regulators of the process.
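The scoring idea behind the first algorithm can be illustrated with a minimal sketch. This is not the team’s published code: the function names, the input layout, and the scoring formula (absolute Pearson correlation with a cell-state measurement, multiplied by normalized interaction degree) are assumptions chosen only to show how connectivity and correlation might be combined into a single score.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def score_genes(expression, cell_state, interactions):
    """Hypothetical scoring: |correlation with cell state| x normalized degree.

    expression   -- dict mapping gene name to per-sample expression values
    cell_state   -- per-sample measurement of the cellular phenotype
    interactions -- iterable of (gene_a, gene_b) known interaction pairs
    """
    # Count how many known interaction partners each gene has.
    degree = {gene: 0 for gene in expression}
    for gene_a, gene_b in interactions:
        if gene_a in degree:
            degree[gene_a] += 1
        if gene_b in degree:
            degree[gene_b] += 1
    max_degree = max(degree.values(), default=0) or 1

    # Genes score highly only if they are both well connected and
    # strongly correlated with the cell-state measurement.
    return {
        gene: abs(pearson(values, cell_state)) * degree[gene] / max_degree
        for gene, values in expression.items()
    }
```

With toy inputs, a gene that tracks the cell state and has many interaction partners ends up at the top of the list, while a gene whose expression never changes scores zero regardless of how connected it is.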
“The computational part of genetic studies is like a game of Jenga: if each block in the tower represents a gene, we are looking for the genes that form the base of the Jenga tower, the ones that hold the whole thing up. Most algorithms can only tell you which genes are in the same row as each other, but ours allow you to determine how far up or down the tower they are, so you can quickly identify the ones that have the greatest influence on the state of the cell in question,” Chatterjee said.
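The centrality ranking used by the second algorithm can be sketched with a plain power-iteration PageRank. This is an illustration under stated assumptions, not the STAMPScreen implementation: the edge direction (each edge points from a regulated gene to its regulator, so that rank accumulates on regulators), the damping factor, and the gene names are all hypothetical.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank over a list of directed (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly,
        # so the total rank always sums to 1.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {
            node: (1 - damping) / n + damping * dangling / n for node in nodes
        }
        for source in nodes:
            for target in out_links[source]:
                new_rank[target] += damping * rank[source] / len(out_links[source])
        rank = new_rank
    return rank


# Assumed convention: (regulated gene, regulator).
edges = [
    ("TARGET_1", "REG_X"),
    ("TARGET_2", "REG_X"),
    ("TARGET_3", "REG_Y"),
    ("REG_Y", "REG_X"),
]
ranking = sorted(pagerank(edges).items(), key=lambda item: item[1], reverse=True)
```

In this toy network, `REG_X` sits at the base of the tower: it is pointed to by two targets and by another regulator, so it receives the highest rank.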
Once the target genes are identified, the STAMPScreen protocol moves from the laptop to the lab, where experiments are performed to disrupt these genes in cells and see what effect that disruption has on the cell. The team systematically evaluated several gene perturbation tools, including complementary DNA (cDNA) and several versions of CRISPR, in human induced pluripotent stem cells (hiPSCs), the first known head-to-head comparisons performed entirely in this highly versatile but challenging cell type.
They then created a new tool that makes it possible to use CRISPR and cDNA within the same cell to unlock synergies between the two methods. For example, CRISPR can be used to turn off the expression of all isoforms of a gene, and cDNA can be used to sequentially express each isoform individually, allowing for more nuanced genetic studies and dramatically reducing the background expression of off-target genes.
Scan library barcodes
The next step in many genetic experiments is to generate a screening library to introduce genes into cells and observe their effects. Typically, gene fragments are inserted into bacterial plasmids (circular pieces of DNA) using methods that work well for small pieces of DNA but are difficult to use when inserting larger genes. Many existing methods also rely on a technique called Gateway, which uses a process called lambda phage recombination and the production of a toxin to kill any bacteria that have not received a plasmid with the gene of interest. The toxin contained in these plasmids is often difficult to work with in the laboratory and can be inadvertently inactivated when a “barcode” sequence is added to a vector to help researchers identify which gene-carrying plasmid the vector has received.
Kramme and Plesa were working with Gateway when they realized that these problems could be solved if they removed the toxin and replaced it with short sequences on the plasmid that would be recognized and cut by a type of enzyme called a meganuclease. The meganuclease recognition sequences do not appear in the genes of any known organism, which ensures that the enzyme will not accidentally cut the inserted gene itself during cloning. These recognition sequences are naturally lost when a plasmid receives a gene of interest, rendering those plasmids immune to the meganuclease. However, all plasmids that do not successfully receive the gene of interest retain the recognition sequences and are cut into pieces when a meganuclease is added, leaving only a pure pool of plasmids containing the inserted gene. The new method, which the researchers dubbed MegaGate, had a cloning success rate of 99.8% and also allowed them to easily barcode their vectors.
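The selection logic described above can be modeled in a few lines. This is a toy simulation, not a cloning protocol: the recognition sequence is a made-up placeholder rather than a real meganuclease site, and the function names and backbone fragments are hypothetical.

```python
# Made-up placeholder sequence, NOT a real meganuclease recognition site.
RECOGNITION_SITE = "GATTACAGATTACAGATT"


def recombine(backbone_left, backbone_right, gene=None):
    """Build a plasmid sequence.

    A successful recombination replaces the recognition site with the gene;
    a failed one (gene=None) leaves the recognition site in place.
    """
    cassette = gene if gene is not None else RECOGNITION_SITE
    return backbone_left + cassette + backbone_right


def digest(plasmids):
    """Simulate meganuclease digestion: plasmids that still carry the site are destroyed."""
    return [plasmid for plasmid in plasmids if RECOGNITION_SITE not in plasmid]


pool = [
    recombine("CCGG", "TTAA", gene="ATGAAACCCGGGTAA"),  # insert succeeded
    recombine("CCGG", "TTAA"),                          # insert failed
    recombine("CCGG", "TTAA", gene="ATGTTTGGGCCCTGA"),  # insert succeeded
]
survivors = digest(pool)  # only the two gene-carrying plasmids remain
```

The design point is that selection depends on the absence of a sequence rather than on a working toxin gene, so adding a barcode elsewhere on the vector does not interfere with it.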
“MegaGate not only solves many of the problems we encountered with older cloning methods, but it is also compatible with many existing gene libraries such as TFome and hORFeome. [...] gene library and a barcoded destination vector library, and two hours later you have your genes of interest barcoded. We’ve cloned almost 1,500 genes with it, and we haven’t had any failures yet,” said Plesa, who is a graduate student at the Wyss Institute and HMS.
Finally, the researchers demonstrated that their barcoded vectors could be successfully inserted into live hiPSCs and that pools of cells could be analyzed using NGS to determine which delivered genes were expressed by the pool. They also successfully used various methods, including RNA-Seq, TAR-Seq, and Barcode-Seq, to read both the genetic barcodes and the full transcriptomes of hiPSCs, allowing researchers to use whichever tool they know best.
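Reading out a pooled screen comes down to counting how often each barcode appears in the sequencing reads. The sketch below assumes exact-match barcodes and short toy reads; the barcode sequences, gene names, and function name are hypothetical, and real pipelines typically also handle sequencing errors and read orientation.

```python
from collections import Counter


def count_barcodes(reads, barcode_to_gene):
    """Count reads per gene by looking for an exact barcode match in each read."""
    counts = Counter()
    for read in reads:
        for barcode, gene in barcode_to_gene.items():
            if barcode in read:
                counts[gene] += 1
                break  # assume at most one barcode per read
    return counts


barcode_to_gene = {"ACGTAC": "GENE_A", "TTGCAA": "GENE_B"}
reads = ["GGACGTACGG", "ACGTACTT", "CCTTGCAACC", "GGGGGG"]
counts = count_barcodes(reads, barcode_to_gene)
```

Here two reads carry the barcode assigned to `GENE_A`, one carries the barcode for `GENE_B`, and the last read matches nothing and is ignored.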
The team anticipates that STAMPScreen could prove useful for a wide variety of studies, including studies of regulatory pathways and genes, screening for differentiation factors, characterizations of drugs and complex pathways, and modeling of mutations. STAMPScreen is also modular, allowing scientists to integrate different parts of it into their own workflows.
“There is a treasure trove of information housed in publicly available genetic datasets, but this information will only be understood if we use the right tools and methods to analyze it. STAMPScreen will help researchers reach eureka moments faster and accelerate the pace of innovation in genetic engineering,” said senior author George Church, Ph.D., a Wyss Core Faculty member who is also a Professor of Genetics at HMS and Professor of Health Sciences and Technology at Harvard and MIT.
“At the Wyss Institute, we aim for high-impact ‘moonshot’ solutions to pressing problems, but we know that to get to the moon we must first build a rocket. This project is a great example of how our community is innovating on the fly to enable scientific breakthroughs that will change the world for the better,” said Wyss founding director Don Ingber, M.D., Ph.D., who is also the Judah Folkman Professor of Vascular Biology at HMS and the Vascular Biology Program at Boston Children’s Hospital, as well as Professor of Bioengineering at the Harvard John A. Paulson School of Engineering and Applied Sciences.
Wyss Institute for Biologically Inspired Engineering at Harvard
Kramme, C., et al. (2021) An integrated pipeline for mammalian genetic screening. Cell Reports Methods. doi.org/10.1016/j.crmeth.2021.100082.