sequencing
Third-generation sequencing technologies are the most recent advance in DNA sequencing and overcome several limitations of earlier generations. They provide long-read capabilities, allowing much larger DNA fragments to be sequenced than was previously possible. One example is PacBio sequencing, which uses a single-molecule, real-time (SMRT) approach with fluorescently labeled nucleotides to read DNA fragments up to tens of kilobases in length. Another is Oxford Nanopore sequencing, in which a single-stranded DNA molecule passes through a nanopore and the resulting changes in electrical current are measured to determine the DNA sequence. Oxford Nanopore sequencing offers long read lengths, portability, and real-time analysis. Third-generation sequencing methods are summarized in Table 1. Figure 3 outlines the technologies available on NGS, the type of data generated by each type of NGS assay, and their applications in brief.
Various approaches used for genome analysis and applications of NGS, including technological platforms, data analysis, and applications. WGS, whole-genome sequencing; WES, whole-exome sequencing; Seq, sequencing; ITS, internal transcribed spacer; ChIP, chromatin immunoprecipitation; ATAC, assay for transposase-accessible chromatin; AMR, anti-microbial resistance.
Short-read sequencing is based on sequencing by synthesis, preceded by enrichment through hybridization, amplification, or fragmentation. Long-read sequencing, in contrast, detects the sequence either by synthesis or by measuring the change in electrical current as each base passes through a pore in a biological membrane. Long-read sequencing can generate reads up to 25–30 kb, whereas short-read sequencing generates reads of around 600–700 bp. Furthermore, amplification bias is eliminated in long-read sequencing, unlike in short-read sequencing. Because the library preparation is PCR-free, base modifications such as DNA methylation can be readily detected by long-read sequencing. The introduction of high-throughput sequencing platforms has significantly reduced error rates and notably improved the accuracy of long-read sequencing technologies [ 29 , 31 ]. Short-read sequencing is useful for determining the abundance of specific sequences, profiling transcript expression, and identifying variants. Long-read sequencing technologies, however, excel in providing comprehensive genome coverage, enabling researchers to identify complex structural variants such as large insertions, deletions, inversions, and duplications [ 8 , 29 , 31 ].
Understanding complex human diseases requires data integration from multiple omics techniques such as genomics, transcriptomics, epigenomics, and proteomics. Here, we briefly describe various omics technologies that are implemented on the NGS platform:
Genomics studies using NGS profoundly analyze DNA using various approaches such as whole-genome sequencing, whole-exome sequencing, and targeted sequencing.
Whole-genome sequencing (WGS) is a powerful and comprehensive genomic analysis technique that involves determining the complete DNA sequence of an individual’s genome. It provides a detailed blueprint of an individual’s genetic makeup, encompassing all the genes, regulatory regions, and non-coding elements present in their genome. It finds its application mainly in discovery science, such as plant and animal research, cancer research, rare genetic diseases, patients with complex disease symptoms, population genetics, and novel genome assembly of eukaryotes and prokaryotes [ 32 ]. By sequencing all the DNA in an organism’s genome, WGS enables the identification of genetic variations, ranging from single-nucleotide polymorphisms (SNPs) to larger structural changes such as insertions, deletions, and rearrangements. This wealth of information obtained through WGS offers a multitude of applications in various fields [ 33 ]. WGS has two types of sequencing approaches on the basis of genome size viz. (1) large whole-genome sequencing deciphering larger genomes of >5 Mb such as eukaryotes, and (2) small whole-genome sequencing deciphering smaller genomes of <5 Mb mainly of prokaryotes. Short-read sequencing is preferred for mutation calling, while long-read sequencing is preferred for genome assemblies. Combining short and long-read sequencing for sequencing novel genomes has been successfully applied for accurate genome assembly without a reference sequence.
Whole-exome sequencing (WES) is a sequencing approach that focuses on capturing and sequencing the protein-coding regions of the genome, known as the exome. The exome represents approximately 1–2% of the entire genome but contains the majority of known disease-causing variants. By sequencing the exome, WES enables the identification of genetic variations, including single-nucleotide variants (SNVs), insertions, deletions, and copy number variations (CNVs), within protein-coding genes [ 34 , 35 ]. WES is a cost-effective alternative to WGS for rare clinical diseases with clusters of symptoms, as well as in identifying variants for population and cancer genetics [ 36 ]. WES involves the enrichment of exonic regions using hybrid capture or target-specific amplification techniques, followed by high-throughput sequencing. Various exome capture assays from NimbleGen, Agilent, Illumina, Twist, and IDT are available that are compatible with the Illumina NGS platform [ 37 ]. The bioinformatic approach used for WES data analysis is the same as that of WGS since WES is a part of WGS.
Targeted sequencing, as the name suggests, has less exploratory power than WGS or WES because it targets specific genes or gene regions. It can detect various types of genetic variation associated with disease phenotypes, from SNVs to small gene deletions, duplications, insertions, and gene rearrangements. Its advantages include cost-effectiveness and a manageable data volume, which gives clinicians more specific, disease-relevant information and makes clinical decisions easier. It can provide much deeper coverage, up to 5000×, for rare alleles in genetic diseases, as well as for low-abundance evolving mutant clones arising from tumor heterogeneity or disease evolution in cancer [ 38 ]. The candidate gene approach and commercially available targeted panels are the result of WGS/WES projects carried out at the population scale. Both germline and somatic variants can be tested using targeted NGS panels, a few examples of which are listed in Table 2 . Targeted panels work on a simple approach of enrichment by amplification using pools of region-specific oligonucleotide primers. The resulting size-specific libraries are then sequenced and analyzed bioinformatically [ 39 ].
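To illustrate why deep coverage matters for rare alleles and low-abundance clones, the following minimal Python sketch treats the number of variant-supporting reads at a site as a binomial draw and compares a shallow depth with the 5000× depth mentioned above. The function names, the 1% variant allele fraction, and the requirement of at least five supporting reads are hypothetical choices made for illustration, and sequencing error is ignored.

```python
from math import comb


def expected_alt_reads(depth: int, vaf: float) -> float:
    """Expected number of reads carrying the variant at a given depth."""
    return depth * vaf


def prob_at_least_k_alt_reads(depth: int, vaf: float, k: int) -> float:
    """P(X >= k) for X ~ Binomial(depth, vaf).

    Assumes every read is an independent draw and ignores sequencing error.
    """
    p_fewer_than_k = sum(
        comb(depth, i) * vaf**i * (1.0 - vaf) ** (depth - i) for i in range(k)
    )
    return 1.0 - p_fewer_than_k


# Hypothetical scenario: a mutant clone present at a 1% allele fraction,
# with a caller that needs at least 5 supporting reads (illustrative cutoff).
VAF = 0.01
MIN_ALT_READS = 5

p_shallow = prob_at_least_k_alt_reads(100, VAF, MIN_ALT_READS)
p_deep = prob_at_least_k_alt_reads(5000, VAF, MIN_ALT_READS)

print(f"100x:  expected alt reads = {expected_alt_reads(100, VAF):.1f}, "
      f"P(detect) = {p_shallow:.4f}")
print(f"5000x: expected alt reads = {expected_alt_reads(5000, VAF):.1f}, "
      f"P(detect) = {p_deep:.4f}")
```

Under these assumptions the clone is very unlikely to reach the read cutoff at 100× but is almost certain to do so at 5000×, which is the practical argument for targeted panels in heterogeneous tumors.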
Examples of targeted panels available in research and diagnostic settings.
Disease Condition | Available Panel | Type of Inheritance | Specimen Type |
---|---|---|---|
Inherited cardiovascular defects | Cardiovascular research panel | Germline | Blood |
Arrhythmias and cardiomyopathies | Arrhythmias and cardiomyopathy research panel | Germline | Blood |
Sensitivity to pharmacological drugs | Pharmacogenomics research panel (PGex Seq panel) | Germline | Blood |
Antimicrobial treatment efficacy testing | Antimicrobial resistance research panel | Microbial gene testing | Bacterial culture |
Infertility conditions | Infertility research panel | Germline | Blood |
Homologous recombination defect analysis | HRR gene panel | Somatic | Tumor tissue |
Myeloid cancers | Myeloid cancer panel | Somatic | Blood |
HIV speciation and drug resistance | HIV-Xgene panel | Pathogen detection | HIV-positive plasma |
Antimicrobial resistance in MTB | TB research panel | Pathogen detection | MTB-positive specimen |
Inborn errors of metabolism | Error of metabolism research panel | Germline | DBS/blood |
Hereditary cancers | BRCA and extended breast and ovarian cancer research panel, inherited cancer research panel | Germline | Blood |
Next-generation sequencing (NGS) has had a transformative impact on transcriptomics, revolutionizing our ability to study the transcriptome—the complete set of RNA molecules in an organism or specific cell population. NGS technologies offer high-throughput and cost-effective methods for profiling and analyzing RNA molecules, allowing researchers to gain deep insights into gene expression, alternative splicing, non-coding RNA regulation, and various biological processes and diseases [ 40 , 41 , 42 , 43 ]. Here are some key roles of NGS in transcriptomics:
Epigenomics refers to the study of epigenetic modifications, which are heritable changes in gene expression patterns that do not involve alterations in the DNA sequence [ 58 , 59 ]. The most common types of epigenetic modifications studied are DNA methylation [ 60 ], histone modification, and RNA methylation (epi-transcriptome). These chemical tags in turn alter DNA accessibility, chromatin remodeling, and nucleosome positioning [ 61 ]. These modifications are influenced by environmental factors such as nutrients, pollutants, toxicants, and inflammation [ 62 , 63 ]. The knowledge and data generated through whole-genome-wide sequencing in humans, plants, and animals [ 64 ] have helped scientists to gain better insights into these epigenetic alterations, especially DNA methylation and hydroxymethylation. Epigenetic alterations have attracted researchers’ and clinicians’ interest in complex disorders such as behavioral disorders, memory, cancer, autoimmune disease, addiction, neurodegenerative, and psychological disorders [ 65 ]. There are various platforms and assays developed to study epigenetic modifications, which have been very well described elsewhere [ 66 ]. NGS has been utilized for investigating epigenomics, as discussed below:
Metagenomics deals with the direct, culture-independent genetic analysis of the microbial genomes contained in a sample, including those of bacteria, fungi, and viruses [ 78 ], using either a targeted approach or an adaptor ligation PCR approach for shotgun sequencing. The targeted approach uses the hypervariable regions of the 16S or 18S ribosomal RNA genes of bacteria and fungi. A blend of conserved and hypervariable regions helps in the identification of each bacterial species in the sample. Similarly, for fungal species identification, the ITS1 and ITS2 regions spanning the 5.8S rRNA gene of the fungal genome are selected for amplification [ 79 ]. For viral genome sequencing, shotgun NGS reads again provide a culture-independent method for studying the diversity, abundance, and functional potential of viruses in the environment. All filtered reads are first mapped to the human reference sequence, and the remaining unmapped reads are then mapped against the NCBI RefSeq viral genomic database ( Table 3 ) [ 80 ]. Targeted viral and bacterial genome panels are also available, e.g., ChapterDx for HR HPV and microbial infection detection, the HIV drug resistance panel, the AMR panel, the gastrointestinal disorder panel, etc.
Based on the nucleotide sequence similarities, pre-processed sequences are clustered at 97% similarity into operational taxonomic units (OTUs). OTUs are compared with the database to identify the microorganisms [ 81 ]. Several analysis pipelines are used for the analysis of 16S amplicon reads ( Table 3 ) [ 82 ]. For shotgun metagenomics samples, taxonomic and functional profiles can be obtained by different approaches, as elaborated in Table 3 [ 83 , 84 , 85 , 86 , 87 , 88 , 89 ]. Microbiome sequencing can identify the full spectrum of microbial species present in the sample. The results are highly quantitative, and one can study the bacterial communities over a specific interval of conditions. The NGS platform can also generate reads for low-abundance species in a sample.
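The clustering of pre-processed sequences into OTUs at 97% similarity can be sketched as a greedy, centroid-based procedure. This is a simplified illustration rather than the algorithm of any particular pipeline: it assumes pre-aligned reads of equal length and measures identity as the fraction of matching positions, whereas production tools compute alignment-based identity and typically order reads by abundance first. All function names and the toy sequences are hypothetical.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this toy identity requires equal-length sequences")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def greedy_otu_clustering(seqs, threshold=0.97):
    """Assign each sequence to the first OTU centroid it matches.

    A sequence that matches no existing centroid at or above the
    threshold becomes the centroid of a new OTU.
    """
    centroids = []  # one representative sequence per OTU
    clusters = []   # members of each OTU, in the same order as centroids
    for seq in seqs:
        for idx, centroid in enumerate(centroids):
            if identity(seq, centroid) >= threshold:
                clusters[idx].append(seq)
                break
        else:
            # No centroid was similar enough: start a new OTU.
            centroids.append(seq)
            clusters.append([seq])
    return clusters


# Toy 100-bp reads: one differs from the reference read at 1 position
# (99% identity), another at 4 positions (96% identity).
base = "ACGT" * 25
near = "T" + base[1:]
far = "TTTTT" + base[5:]

otus = greedy_otu_clustering([base, near, far])
print(len(otus), "OTUs; sizes:", [len(members) for members in otus])
```

Each resulting centroid would then be compared with a reference database to assign taxonomy, as described above.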
NGS generates vast amounts of DNA or RNA sequences, necessitating computational methods to handle, analyze, and interpret these data. Raw sequencing data produced by NGS instruments need to be processed, analyzed, and interpreted to derive biological insights. This is where bioinformatic approaches come into play. These approaches encompass a wide range of computational methods, algorithms, and tools that handle preprocessing, alignment, variant calling, gene expression quantification, differential expression analysis, and other specialized analyses. Once processed, various computational techniques, such as de novo assembly, reference-based mapping, and transcriptome analysis, are employed to extract meaningful biological information. Furthermore, advanced bioinformatic tools facilitate the identification of genetic variations, including single-nucleotide polymorphisms (SNPs), copy number variations (CNVs), and structural variants. Integrative analyses, combining NGS data with other genomic and functional data sources, enable the exploration of gene expression and regulatory networks. The various bioinformatics tools used in NGS analysis are listed in Table 3 .
Bioinformatic steps and tools used for NGS data analysis.
Analysis | Commonly Used Tools |
---|---|
Quality check of sequences | FastQC, FASTX-Toolkit, MultiQC |
Trimming of adaptors and low-quality bases | Trimmomatic, Cutadapt, fastp |
Alignment of sequence reads to reference genome | BWA, Bowtie, DRAGMAP |
Reports visualization | MultiQC |
Removal of duplicate reads | Picard, Sambamba |
Variant calling (single-nucleotide polymorphisms and indels) | GATK, freeBayes, Platypus, VarScan, DeepVariant, Illumina DRAGEN |
Filter and merge variants | bcftools |
Variant annotation | ANNOVAR, Ensembl VEP, SnpEff, NIRVANA |
Structural variant calling | DELLY, Lumpy, Manta, GRIDSS, Wham, Pindel |
Copy number variation (CNV) calling | CNVnator, GATK gCNV, cn.MOPS, cnvCapSeq (targeted sequencing), ExomeDepth (CNVs from exome) |
Alignment of RNA-seq reads to reference | Splice-aware aligners such as TopHat2, HISAT2, and STAR |
Transcript quantification | featureCounts, HTSeq-count, Salmon, Kallisto |
Differential gene expression analysis and enrichment of gene categories | DESeq2, edgeR, DAVID, clusterProfiler, Enrichr |
Bisulfite sequence aligners | Bwameth, BS-Seeker2, Bismark |
Methylation level quantification | MethylDackel * |
Differential methylation | Metilene, BSsmooth, methylKit |
Removal of PCR duplicates | Samtools |
Peak calling | MACS2, SICER2, SPP |
Peak filtering | Bedtools |
Enrichment quality control | ChIPQC, Phantompeakqualtools |
Enrichment comparison | DiffBind, MAnorm, MMDiff |
Motif analysis | MEME-ChIP, HOMER, RSAT |
16S rRNA-seq analysis pipelines | QIIME2, mothur, USEARCH |
Ribosomal RNA databases | Greengenes, Silva, RDP |
Taxonomic classification | MetaPhlAn4, Kaiju, Kraken |
Assembly of metagenomic reads | metaSPAdes, MetaIDBA |
Protein databases for taxonomic classification | NCBI non-redundant protein database |
Gene annotation | Prokka, MetaGeneMark |
Databases for functional annotation of genes | COG, KEGG, GO |
Footnote: ANNOVAR, ANNOtate VARiation; BWA, Burrows-Wheeler Aligner; cn.MOPS, Copy Number Estimation by a Mixture Of PoissonS; COG, Clusters of Orthologous Groups of Proteins; DAVID, Database for Annotation, Visualization and Integrated Discovery; Ensembl VEP, Ensembl Variant Effect Predictor; fastp, FASTQ Preprocessor; GATK, Genome Analysis Toolkit; GO, Gene Ontology; HISAT2, Hierarchical Indexing for Spliced Alignment of Transcripts; HOMER, Hypergeometric Optimization of Motif EnRichment; HTSeq-count, High-Throughput Sequence Analysis in Python; KEGG, Kyoto Encyclopedia of Genes and Genomes; NCBI, National Center for Biotechnology Information; MACS, Model-Based Analysis for ChIP-Seq; MEME, Multiple EM for Motif Elicitation; Meta-IDBA, Meta-Iterative De Bruijn Graph De Novo Short-Read Assembler; MetaPhlAn, Metagenomic Phylogenetic Analysis; metaSPAdes, meta St Petersburg Genome Assembler; QIIME, Quantitative Insights Into Microbial Ecology; RDP, Ribosomal Database Project; RSAT, Regulatory Sequence Analysis Tools; SICER, Spatial Clustering Approach for the Identification of ChIP-Enriched Regions; SPP, The Signaling Pathways Project; STAR, Spliced Transcripts Alignment to a Reference. * Available at: https://github.com/dpryan79/MethylDackel/ (accessed on 1 June 2023). Bold represents the categories of analysis and commonly used bioinformatics tools used for NGS data analysis.
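The first two steps in Table 3, quality checking and trimming of low-quality bases, can be illustrated with a short Python sketch. It assumes the common Phred+33 FASTQ quality encoding and uses a deliberately simple rule (remove the 3' tail while bases fall below a quality cutoff). The tools listed in the table implement more elaborate strategies, so this shows the idea only and does not reproduce any of them. The function names and the Q20 cutoff are illustrative assumptions.

```python
def phred_scores(quality_line: str, offset: int = 33) -> list:
    """Decode a FASTQ quality string into Phred scores (assumes Phred+33)."""
    return [ord(ch) - offset for ch in quality_line]


def mean_quality(quality_line: str) -> float:
    """Mean Phred score of a read; 0.0 for an empty read."""
    scores = phred_scores(quality_line)
    return sum(scores) / len(scores) if scores else 0.0


def trim_3prime(seq: str, qual: str, min_q: int = 20):
    """Remove bases from the 3' end while their quality is below min_q."""
    scores = phred_scores(qual)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], qual[:end]


# Toy read: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
read_seq = "ACGTACGT"
read_qual = "IIIIII##"

trimmed_seq, trimmed_qual = trim_3prime(read_seq, read_qual)
print("mean quality before trimming:", mean_quality(read_qual))
print("trimmed read:", trimmed_seq, trimmed_qual)
```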
NGS has revolutionized scientific research and clinical genomics through high-throughput multiplexing. The power of NGS in translational medicine lies not only in its multiplexing efficiency but also in the bioinformatic tools used for data curation and in the reference databases that help researchers, medical practitioners, and drug designers understand the genetic basis of disease. Population genome sequencing projects such as 1000 G, ExAC, ESP6500, UK 100 K, Indigenome, and gnomAD have generated vast amounts of NGS data [ 162 ]. Among the reference population databases, gnomAD is the largest and most widely used; it is built from harmonized exome and genome sequencing data from 140,000 humans. It is widely used as a resource for estimating allele frequency in rare diseases, for disease gene discovery, and for studying the biological effect of variation [ 163 ]. These resources have led to the creation of knowledge bases and, in turn, to large and small sequencing panels for major applications in clinical research and diagnostics [ 164 ]. The large gene panels find their major application in clinical research, mainly in cancer genetics.
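As a simple illustration of how a population database supports rare-disease analysis, the sketch below computes allele frequency as the allele count divided by the total number of alleles genotyped and keeps only variants below a frequency cutoff. The record layout, the variant identifiers, the counts, and the 1% cutoff are all hypothetical; they do not reflect the schema or contents of gnomAD or any other resource named above.

```python
def allele_frequency(allele_count: int, allele_number: int) -> float:
    """Allele frequency = observed alternate alleles / total alleles genotyped."""
    if allele_number <= 0:
        raise ValueError("allele_number must be positive")
    return allele_count / allele_number


def filter_rare(variants, max_af=0.01):
    """Keep variants whose population allele frequency is below max_af."""
    return [
        v for v in variants
        if allele_frequency(v["ac"], v["an"]) < max_af
    ]


# Made-up example records (not real database entries).
variants = [
    {"id": "var_A", "ac": 3, "an": 250_000},       # very rare
    {"id": "var_B", "ac": 40_000, "an": 250_000},  # common polymorphism
]

rare = filter_rare(variants)
print([v["id"] for v in rare])
```

A variant that is common in the general population is unlikely to cause a rare Mendelian disorder, which is why frequency filtering of this kind is usually an early step in variant prioritization.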
5.1.1. Microbiome Research
Given the ubiquitous nature of microbes, their symbiotic, pathogenic, and commensal characteristics are of importance to humans by forming a highly functioning ecosystem. The microbiome community became an obligatory factor in our survival through evolution [ 165 ]. However, a close monitoring and comprehensive understanding of the host–microbiome and microbiome–intercommunity interactions are vital to healthy survival. The approaches include pathogen surveillance, functional dysbiosis, and therapeutic potential. Metagenomic studies have linked the gut microbiome to disorders affecting mental health [ 166 ], autoimmune diseases (rheumatoid arthritis) [ 167 ], and metabolic disorders (diabetes and obesity) [ 168 ], thus instrumental in evaluating the functional potential of the microbiome. This opens doors for more therapeutic approaches and options. Designing targeted panels to pick up mutations (aiding in antibiotic resistance tracking) or identifying the pathogenic genes followed by sequencing can help in detecting pathogens with known antimicrobial resistance. Research is also underway for the pharmacomicrobiomics of individuals requiring drug treatment. This would aid in identifying the effect of drugs on an individual’s microbiome and drug disposition by the microbiome.
The focus of NGS-based research is now extended from genomic research to the study of transcriptome, epi-transcriptome, and epigenome. Human genome-based research through WGS and WES has provided novel insights into the biological processes and has found application in wellness research; agriculture and food research; genome-wide association research studies uncovering the wide range of population genetic variants; their genetic linkage and molecular basis to various diseases, including cancer; and the study of new pathogenic/emerging variants such as SARS-CoV-2 variants in human diseases. The redefinition of the mutational landscapes in tumors has resulted in translating this information into clinical research through the ever-growing list of targeted large gene panels such as the 261 gene panel, the 400 gene panel, the TSO 500 panel from Illumina, IDT, Agilent, and Thermo Fisher. These panels assess not only SNVs but also clinically relevant CNVs and RNA fusion transcripts, TMB, and microsatellite instability (MSI) for lung cancer, breast cancer, colorectal cancer, and even for difficult cancers such as ovarian, pancreatic, renal, urothelial cancers, etc.
RNA-seq finds its application mainly in research for analyzing pathogen transcriptomic signatures [ 169 ], metastatic biomarkers, therapeutic resistance, the immune microenvironment, immunotherapy, and neoantigen research in cancer [ 170 , 171 ]. With NGS, it is now possible to study single-cell behavior with respect to differentiation, de-differentiation, proliferation, and tumorigenesis in cancer using single-cell RNA-sequencing strategies such as Smart-seq2, MATQ-seq, SUPeR-seq, Drop-seq, Seq-Well, Chromium, DroNC-seq, STRT-seq, etc. [ 172 ]. The recently developed Ribo-seq technique can map ongoing translation events in the cytosol, which is useful in identifying potentially functional micro-peptides. This is how thousands of sORFs (small open-reading frames) were discovered in lncRNA. Thus, with transcriptomics, Ribo-seq, and MS proteomics, the bifunctional potential of RNA molecules can be identified [ 173 , 174 ].
The role of epigenomics in gene regulation, the maintenance of tissue-specific expression, and developmental processes is evident from studies of X chromosome inactivation, embryonic development, genomic imprinting, epigenetic reprogramming, cell identity establishment, and lineage specification. Epigenetic signatures are important biomarkers that hold promise not only in cancer, malignant transformation, and metastasis but also in other disease conditions such as diabetes, neurological conditions, infectious diseases, and immune disorders [ 175 , 176 ]. The reversible nature of epigenetic changes makes them promising candidates for precision medicine in cancer and other conditions [ 164 , 176 ]. Pharmacoepigenomics is an emerging research area in which the relationship between variable drug response and epigenetic status is being studied [ 59 ]. Epi-drugs have been developed over the last 40 years; a few are in clinical practice, whereas some are in clinical trials [ 177 ]. Apart from epigenetic modifications, non-coding RNAs (ncRNAs) are gene expression regulators that are being explored as drug targets. Numerous lncRNAs have subsequently been identified and found to be aberrantly expressed in various tumors [ 58 ]. A growing number of studies have shown miRNAs to be biomarkers of multiple cancers, as their abnormal quantity has been correlated with the stage of pathology and prognosis [ 178 ]. The application of miRNA analogs or anti-miRNAs has shown promising outcomes in in vitro and in vivo cancer studies, suggesting that miRNA-based drugs are emerging as a novel strategy for cancer therapy [ 179 ]. Apart from cancer, multiple FDA-approved drugs exist for DMD, SMA, familial hypercholesterolemia, CMV retinitis, etc. [ 178 ].
A decisive approach is important when selecting an NGS assay. Type of variant, disease symptoms, and probable genetic associations are important aspects when selecting NGS-based tests in clinical decision making, as per recommendations by the National Comprehensive Cancer Network (NCCN), the College of American Pathologists (CAP), the American Society of Clinical Oncology (ASCO), the Association of Molecular Pathology (AMP), the American College of Medical Genetics (ACMG), and the European Society of Medical Oncology (ESMO).
The identification of the exact etiological agent in microbial infections is important for precision medicine, which has driven the approach of syndromic testing/multiple pathogen testing assays such as BioFire or multiplex PCRs. However, with the limitations of multiplexing, NGS panels are being developed that can detect any pathogen using a shotgun approach or a targeted approach (16S) from various diseased specimens or clinical isolates. These panels can not only pick up causative pathogens but can be used to identify drug-resistant mutations such as antimicrobial drug-resistant mutations and antiviral drug-resistant mutations [ 180 ]. The useful data generated through NGS on microbial identification and drug resistance genotyping, e.g., in MTB, HIV, and SARS-CoV-2 [ 181 ], have proven important for disease surveillance, disease containment, public health epidemiological studies, policy making, and rapid therapeutic interventions, as evident during the COVID-19 outbreak [ 182 ]. However, with the need for fast diagnosis, NGS, in its current form for infectious pathogen detection, cannot replace current standard point-of-care testing such as PCR, multiplex BioFire panel testing, or multiplex QPCR commercial kits.
The association of multiple genes with multifactorial disorders such as diabetes, hypercholesterolemia, and infertility has been discovered in the rapidly emerging field of genomics. For example, with the classical approach, comprehending the genes participating in infertility, gametogenesis, the hormonal cycle, fecundation, and embryo development would have been difficult and time-consuming. Targeted NGS panels, which evolved as a result of WGS and WES, have enabled the simultaneous evaluation of multiple genes and their variants, helping to explain the complexity of various disorders, including infertility and inherited genetic diseases. They also support reproductive genome testing, including NIPT (non-invasive prenatal testing) and PGS/PGD (preimplantation genetic screening/diagnosis), and the testing of pediatric disorders such as developmental delay disorders and metabolic syndromes [ 183 ]. This has enabled disease treatment through personalized genome testing for the betterment of human health, preventive testing, and disease management.
NGS-based HLA typing using WGS or targeted panels provides more unambiguous, high-throughput, high-resolution typing results from a single platform than conventional HLA typing methods for organ transplant or HSCT. This approach provides complete information on all the HLA loci involved in the etiopathogenesis of immune disorders such as coeliac disease, psoriasis, rheumatoid arthritis, type I diabetes, SLE, and lung diseases (e.g., asthma or sarcoidosis) [ 184 ]; infectious disease predispositions (e.g., HIV, hepatitis, leprosy, tuberculosis); and other conditions such as malignancies and neuropathies [ 185 ]. It also supports the generation of population/ancestry-based databases.
Epigenetic regulation through methylation profiling was in fact first studied in the HLA genes, whose epigenetic regulators are located in non-coding regions such as enhancers, promoters, and UTRs that regulate HLA gene expression. Bioinformatically, the sequence data obtained are analyzed using commercial HLA-specific software such as NGSengine or exome-data-based software such as OptiType [ 186 ], Polysolver [ 187 ], xHLA [ 188 ], and HLAminer [ 189 ] to determine the HLA types [ 190 ].
Comprehensive human genome sequencing through WGS and WES has identified cancer as a disease of the genome: a multifactorial disease with non-Mendelian (somatic) origin in the majority of cases and Mendelian origin in inherited cancers. Through the efforts of TCGA (The Cancer Genome Atlas) and ICGC (International Cancer Genome Consortium), comprehensive data on gene alterations in protein-coding regions are now readily available for all types of human cancers, advancing the understanding of cancer [ 191 ].
Different enterprises have developed multigene panels based on TCGA and ICGC data for different NGS platforms, and these are now frequently used in cancer prognosis and therapeutics [ 191 ]. Examples include FoundationOne by Foundation Medicine (Cambridge, MA, USA), Oncomine by Thermo Fisher (Waltham, MA, USA), CANCERPLEX by KEW (Cambridge, MA, USA), MSK-IMPACT by the Memorial Sloan Kettering Cancer Center (New York, NY, USA), OmniSeq Advance by the Roswell Park Cancer Institute (Buffalo, NY, USA), the CC Onco Panel by Sysmex (Kobe, Japan), and the Todai Onco Panel by Riken Genesis (Tokyo, Japan). Figure 4 summarizes the various data integration methods for cancer diagnosis, prognosis, and therapeutics [ 192 ]. Although not all alterations picked up by NGS find immediate application in translational medicine, they help uncover the different pathways operating in cancer pathogenesis and build up the cancer genomics database. Lung cancer biomarkers have been developed for over a decade, leading to a commercial NGS panel of 15–21 genes for precision oncology in lung cancer that picks up all types of structural variants (SVs) on a single platform [ 193 , 194 ]. This landmark work in precision oncology for lung cancer opened the door for solid tumors such as CRC, breast, ovarian, endometrial, and pancreatic cancers, and even liquid tumors such as myeloid and lymphoid malignancies, to use NGS panels effectively with limited sample requirements, infrastructure, and technical and analytical expertise [ 98 ]. Thus, a comprehensive gene testing approach in cancer can maximize treatment efficacy and shorten the window of disease progression in a cancer patient, resulting in improved QOL (quality of life), PFS (progression-free survival), and OS (overall survival).
Role of NGS technology in cancer diagnosis, prognosis, and therapeutics using an integrative omics approach. FFPE, formalin-fixed paraffin-embedded; Bx, biopsy; AI, artificial intelligence; ML, machine learning.
One important aspect of somatic mutation testing in cancer is tumor heterogeneity. It must be handled carefully by setting variant-calling cutoff thresholds that avoid false-positive or false-negative variant calling and reporting [ 195 ]. Because NGS is the most sensitive method of mutation detection, it can identify evolving mutant clones and the allelic burden of a mutation, and can thus track disease progression. Liquid biopsy testing, which uses circulating tumor DNA, has become a very handy tool for tracking disease progression and monitoring treatment in clinical oncology in a metastatic setting [ 196 ]. NGS also plays a crucial role in identifying biomarkers associated with hereditary/germline cancers. For example, in the case of hereditary breast and ovarian cancer syndrome (HBOC), the understanding of its genetic basis has evolved beyond the BRCA1 and BRCA2 mutations. The inclusion of other genes involved in the homologous recombination repair (HRR) pathway, known as BRCAness genes, has reshaped our understanding of HBOC. These additional genes include CDH1, PTEN, TP53, STK11, PALB2, ATM, CHEK2, MUTYH, BARD1, MRE11A, NBN, RAD50, RAD51C, RAD51D, and NF1, in addition to BRCA1 and BRCA2. NGS has facilitated the identification and characterization of these extended sets of genes associated with HBOC, expanding our knowledge of hereditary cancer predisposition [ 197 ].
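The variant-calling cutoffs discussed above for handling tumor heterogeneity can be pictured as a simple filter on read depth, supporting reads, and variant allele fraction (VAF). The Python sketch below is illustrative only: the `Call` record, the field names, and the three threshold values are assumptions chosen for the example. In practice, laboratories derive such cutoffs from assay validation data.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """Hypothetical somatic variant call with per-allele read counts."""
    name: str
    ref_reads: int
    alt_reads: int

    @property
    def depth(self) -> int:
        return self.ref_reads + self.alt_reads

    @property
    def vaf(self) -> float:
        """Variant allele fraction = alt reads / total reads at the site."""
        return self.alt_reads / self.depth if self.depth else 0.0


def passes_cutoffs(call: Call, min_depth: int = 500,
                   min_alt_reads: int = 10, min_vaf: float = 0.02) -> bool:
    """Report a call only if depth, supporting reads, and VAF all pass.

    The default thresholds are placeholders, not recommended values.
    """
    return (call.depth >= min_depth
            and call.alt_reads >= min_alt_reads
            and call.vaf >= min_vaf)


calls = [
    Call("clonal_variant", ref_reads=950, alt_reads=50),     # VAF 5%
    Call("likely_noise", ref_reads=995, alt_reads=5),        # VAF 0.5%
    Call("low_depth_variant", ref_reads=90, alt_reads=10),   # depth 100
]

reported = [c.name for c in calls if passes_cutoffs(c)]
print(reported)
```

Raising the cutoffs reduces false positives from sequencing noise but risks missing genuine subclonal variants, which is the trade-off that heterogeneous tumors force on the reporting threshold.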
Ever since 1984, when Sir Alec Jeffreys first proposed the application of DNA profiling to distinguish between different samples at a crime site, DNA analysis has emerged as a prime investigative tool in forensic science [ 198 ]. This field is now being dominated by NGS, which is leaving behind older methods of DNA fingerprinting such as restriction fragment length polymorphism (RFLP), mitochondrial DNA, variable number of tandem repeat (VNTR) profiling, and short tandem repeat (STR) typing to solve an array of criminal mysteries [ 199 ]. NGS has rapidly gained importance in this domain because it delivers highly accurate, reproducible, and highly sensitive results from the heavily contaminated and degraded samples received in forensic labs [ 200 ]. NGS is being applied to solve different categories of criminal cases: mtDNA for the investigation of maternal lineage [ 201 ], Y chromosome STR analysis for the identification of male DNA in a contaminated sample [ 202 ], animal and plant DNA analysis to identify important clues in poisoning cases [ 203 ], ancestry tracing [ 204 ], predicting phenotypes based on the genes [ 205 ], epigenetic analysis to identify the age of the donor DNA [ 206 ], and microRNA analysis for identifying body fluids and post-mortem interval [ 207 ]. The application of NGS in biodefense and bioterrorism, involving the detection of microbial signatures at crime sites, is another discipline rapidly gaining traction [ 208 , 209 ]. The major NGS platforms dominating the forensic domain are Illumina’s MiSeq FGx and Thermo Fisher’s Ion Torrent PGM and Ion S5 [ 210 , 211 ].
The future scope of NGS holds tremendous potential for advancements and applications in various fields. Progress in bioinformatics, robotics, liquid handling, and nucleic acid preparation will make NGS methods faster and more precise. Forthcoming sequencing platforms will require smaller amounts of input DNA and reagents, scaling down to zeptoliters and even a few molecules. They will also become increasingly portable, enabling diagnostic use in medical, agricultural, ecological, and other field-based settings. NGS has already revolutionized fields such as clinical diagnostics, cancer genomics, and microbial genomics, providing unprecedented insights into the genetic underpinnings of diseases and driving personalized medicine. As the technology progresses, NGS is expected to play a pivotal role in areas such as single-cell genomics, long-read sequencing, epigenomics, and multi-omics integration, enabling a deeper understanding of cellular processes, disease mechanisms, and personalized treatment strategies. The development of real-time sequencing and point-of-care applications will further extend the reach of NGS, empowering rapid diagnostics and monitoring in various settings. Higher-order multiplexing, supported by advances in robotics, liquid handling, and sample processing, will enable more samples to be processed in a shorter time and at a reduced cost. Equally important will be advances in faster and more accurate bioinformatic data analysis, which are crucial for extracting meaningful insights from the vast amount of NGS data generated, as well as in data transfer and storage.
With ongoing technological improvements and cost reduction, NGS will become more accessible and widespread, facilitating its integration into routine clinical practice, research, agriculture, and environmental studies. The future of NGS is promising: it stands to unlock new frontiers of knowledge and catalyze advancements that will have a profound impact on human health, agriculture, environmental conservation, and beyond.
The authors express their heartfelt gratitude and tribute to the late Professor Michael Green from the Department of Molecular Cell and Cancer Biology, UMass Chan Medical School, for his invaluable support and remarkable contributions to the field of molecular genetics and cancer genomics.
This research received no external funding.
Conceptualization: H.S., G.D. and S.K.M.; original draft preparation: H.S., K.J., G.D. and S.K.M.; literature search, analysis, writing, review, and editing: H.S., K.J., U.M., S.W., G.Z., S.R., R.P.T., S.B., A.K.M., G.D. and S.K.M.; visualization: H.S., S.W. and R.P.T.; supervision: A.K.M., G.D. and S.K.M. All authors have read and agreed to the published version of the manuscript.
The authors declare no conflict of interest.
Breaking down how the gene editing technology is being used, for the first time in the United States, to treat patients with severe medical conditions
Lila Thulin
Former Associate Editor, Special Projects
Last fall, the birth of genetically edited twin girls in China, the world’s first “designer babies,” prompted an immediate outcry in the medical science community. The change to the twins’ genomes, performed using the gene editing technology CRISPR, was intended to make the girls more resistant to H.I.V. But the edited genes may result in adverse side effects, and the International Commission on the Clinical Use of Human Germline Genome Editing is currently working on stricter and less ambiguous guidelines for editing the DNA of human embryos as a response to the rogue experiment.
Human genetic engineering has also witnessed more regulated advances. In the past 12 months, four clinical trials launched in the United States to use CRISPR to treat and potentially cure patients of serious medical conditions.
CRISPR-Cas9 is a technology derived from single-celled prokaryotic microorganisms and is composed of guide strands of RNA as well as the Cas9 enzyme, which does the "cutting." It allows scientists to make changes at highly specific locations in a cell’s genetic code by removing or replacing parts of the genome. Even tiny changes to individual genes can fundamentally alter the function of a cell. CRISPR has been used to edit all types of organisms, from humans to corn, but clinical trials represent a stride toward turning the technology into a drug or medical treatment.
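How a guide RNA picks out a "highly specific location" can be sketched in a few lines. The example below assumes the commonly used Streptococcus pyogenes Cas9, which cuts next to a 20-letter target that is immediately followed by a short "NGG" motif (any base, then two guanines). The article does not say which Cas9 variant the trials use, the function name is hypothetical, and the sketch scans only one DNA strand.

```python
import re


def find_cas9_sites(seq, protospacer_len=20):
    """Return (start, protospacer, pam) for every candidate site on the given strand.

    Assumes an SpCas9-style NGG motif directly after the protospacer.
    """
    seq = seq.upper()
    # A lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Toy sequence: a 20-letter target, then the motif "TGG", then one extra base.
toy_sequence = "ACGTACGTACGTACGTACGT" + "TGG" + "A"
for start, protospacer, pam in find_cas9_sites(toy_sequence):
    print(start, protospacer, pam)
```

Real guide design also checks the opposite strand and screens the rest of the genome for near-matches, which is where the concern about unintended "off-target" edits comes from.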
The clinical trials in the U.S. are Phase 1 and 2 trials, small studies designed to demonstrate the safety and efficacy of a potential treatment. Essentially, these make-or-break trials take a drug from the laboratory to test on real patients. They’re “the first requirement for a product to end up on the market,” says Saar Gill, an assistant professor at the University of Pennsylvania’s medical school who works on genetically-edited immune cells.
While some of the diseases CRISPR therapies aim to tackle have other treatments available, part of gene editing’s allure lies in the possibility of a more effective or even permanent fix. The four U.S. clinical trials involving CRISPR have the potential to tackle cancers such as melanoma and lymphoma, sickle cell disease, and even blindness.
“As complicated and expensive as [genetic editing] is, you really are talking about the potential to cure a disease or essentially halt its progress or its adverse effect on the body forever,” Gill says.
The first clinical trial in the U.S. to use CRISPR in a treatment began last September. Led by University of Pennsylvania professor of medicine Edward Stadtmauer, it consists of genetically modifying patients’ own T cells—a type of immune cell that circulates in the blood—to make them more efficient at fighting certain kinds of cancer cells. The 18 patients will have types of relapsed cancer, like multiple myeloma or melanoma, that tend to overproduce an antigen called NY-ESO-1.
Once the T cells have been extracted from the patients’ blood, scientists will make several edits using CRISPR as well as a genetic modification technique derived from viruses like H.I.V. An added gene will cause the modified T cells to target cells with NY-ESO-1 as if it were a microscopic signal flare.
Another edit will stop T cells from producing proteins that could distract the cells from targeting NY-ESO-1. And researchers will also aim to turbo-boost the T cells by eliminating a protein called PD-1 that can prevent the T cells from killing cancer cells.
Patients will undergo chemotherapy to deplete their natural reserve of T cells, and then they’ll receive an infusion of the edited cells to replace them. The specific chemotherapy isn’t likely to affect the patients' cancers, so that step of the trial won’t complicate the study’s assessment of the usefulness of T cell therapy.
According to a spokesperson for Penn Medicine, two patients, one with multiple myeloma and one with sarcoma, have already begun treatment. The trial is scheduled to conclude in 2033, and it will assess both safety (whether the edited T cell treatment leads to any negative side effects) and efficacy (measured by outcomes such as whether the cancer disappears, the length of remission, and overall patient survival).
A trial helmed by Massachusetts-based Vertex Pharmaceuticals and CRISPR Therapeutics is the first CRISPR-based clinical trial in the U.S. for a condition with a clear, heritable genetic basis: sickle cell disease. The recessive condition is caused by a single base-pair change, meaning that both copies of a patient’s affected gene differ by just one genetic “letter” from a normally functioning gene. Victoria Gray, a 34-year-old woman from Mississippi who was recently profiled by NPR, was the first patient to receive CRISPR-edited stem cells as part of the trial.
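The effect of a one-letter change can be shown with a short sketch. It uses the textbook sickle cell substitution in the beta-globin gene, in which the three-letter "word" GAG (glutamate) becomes GTG (valine). That detail is standard genetics rather than something stated in the article, and the nine-letter fragment and tiny lookup table below are trimmed down for illustration.

```python
# Only the three codons needed for this fragment; a full table has 64 entries.
CODON_TABLE = {"CCT": "Pro", "GAG": "Glu", "GTG": "Val"}


def translate(dna):
    """Translate a DNA fragment, read in frame, into a list of amino acids."""
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3)]


def point_differences(a, b):
    """List (position, base_in_a, base_in_b) wherever two equal-length sequences differ."""
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


normal_fragment = "CCTGAGGAG"  # Pro-Glu-Glu
sickle_fragment = "CCTGTGGAG"  # Pro-Val-Glu

print(point_differences(normal_fragment, sickle_fragment))
print(translate(normal_fragment), "->", translate(sickle_fragment))
```

A single changed letter swaps one amino acid in the resulting protein.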
The disease, which occurs most frequently in people of African descent, affects a protein called hemoglobin, which plays a critical role in helping red blood cells carry oxygen to different tissues in the body. Sickle cell causes hemoglobin proteins to clump into long fibers that warp disc-shaped red blood cells into sickle shapes. The irregularly shaped blood cells are short-lived and can’t flow smoothly through blood vessels, causing blockages, intense pain and anemia.
Like the University of Pennsylvania T cell study, the sickle cell trial involves editing a patient’s own cells ex-vivo, or outside of the body in a lab. Stem cells are collected from the bloodstream and edited with CRISPR so they will pump out high levels of fetal hemoglobin, a protein that typically dwindles to trace levels after infancy. Fetal hemoglobin (HbF) is encoded by an entirely different gene than beta-globin, the part of hemoglobin that can cause red blood cells to sickle. Adults with sickle cell whose bodies naturally make more HbF often experience less severe symptoms. Fetal hemoglobin can take one or both of sickle hemoglobin’s spots in the four-part hemoglobin molecule, substantially lowering a cell’s likelihood of adopting a sickle shape.
The trial, slated to conclude in May 2022, will destroy participants’ unedited bone marrow cells with chemotherapy and then inject edited stem cells through a catheter in a one-time infusion. Doctors will look for the treatment to generate 20 percent or more HbF in the bloodstream for at least three months. Fetal hemoglobin normally constitutes only around 1 percent of adults’ hemoglobin supply, but previous studies have shown that proportions of fetal hemoglobin above 20 percent can keep enough cells from sickling to significantly reduce symptoms, including severe pain episodes.
If successful, the therapy would offer another option for a disease with few available treatments. The only current cure for sickle cell disease is a bone marrow transplant, but, according to the National Heart, Lung, and Blood Institute, such transplants work best in children and the likelihood of finding a marrow donor match is low. Just two FDA-approved drugs for sickle cell currently exist, aimed at ameliorating the worst of patients’ symptoms, and one of them, hydroxyurea, also works by increasing fetal hemoglobin.
The same companies behind the sickle cell treatment have also begun a trial to use CRISPR-edited T cells to treat non-responsive or relapsed non-Hodgkin’s lymphoma, a cancer of the lymphatic system, which plays a major role in the body’s immune response. Unlike the University of Pennsylvania trial, the study involves editing T cells from donors. The cells will be edited using CRISPR to target CD-19, a protein that marks B cells, which become malignant in some types of non-Hodgkin’s lymphoma. The edits also remove two proteins to stop a patient’s immune system from rejecting the donated T cells and to prevent the edited T cells from attacking non-cancerous cells.
A 2019 poster from the researchers explains that a prototype treatment in mice with acute leukemia stalled tumor growth for about 60 days. Additionally, lab tests showed that modified human T cells were successfully able to target and kill CD-19-marked cancer cells. For the clinical trial, which will eventually include a maximum of 95 participants, researchers will track how patients tolerate different doses of the T cell treatment and how many patients see their cancers shrink or disappear entirely. After the treatment is complete, scientists will keep tabs on patients and their survival and recurrence rates over the course of five years.
At the end of July, Cambridge, Massachusetts-based Editas Medicine, working with Irish company Allergan, announced that they’d begun enrollment in a clinical trial for EDIT-101, a treatment for a type of inherited childhood blindness known as Leber Congenital Amaurosis (LCA). It will be the first instance of a CRISPR clinical trial that conducts cellular editing within a human body, or in vivo. The trial will include about 18 participants, including patients as young as age 3, with a particular subset of LCA caused by a single genetic mutation that impairs photoreceptors. These cells in the eye convert light into signals for the brain to process.
The treatment comes in the form of an injection into the space behind the retina. A type of virus known as an adeno-associated virus will “infect” the photoreceptor cells with DNA instructions to produce Cas9, the CRISPR enzyme, to cut the photoreceptor genome in specified locations. The edits change the photoreceptors’ DNA to fix the blindness-causing mutation, spurring the cells to regrow previously faulty light-sensing components, which should improve the patients’ vision.
Medical researchers aim to affect 10 percent or more of the targeted photoreceptor cells, the threshold that other research suggests is required to make a leap in visual acuity. Medical staff will measure patients’ vision in various ways, including an obstacle course featuring barriers with different contrast levels, a color vision test, the pupil’s response to light, and the person’s own assessment of visual change.
The EDIT-101 treatment has been tested in non-human primates and also in tiny samples of a donated human retina. In the human retina, the desired edit was made about 17 percent of the time, and scientists detected no unintended “off-target” changes.
The method of injecting a virus subretinally to treat LCA has been successful before. Jean Bennett and Albert Maguire’s treatment Luxturna doesn’t involve CRISPR, but it does use a similar viral injection to deliver a working copy of a malfunctioning gene to pigment cells in the retina. The work was recognized by Smithsonian magazine’s 2018 Ingenuity Award for life sciences.
Early clinical trials are not without risks. In 1999, an 18-year-old participant named Jesse Gelsinger died in a Phase 1 gene therapy trial, a tragedy that still lingers over the field. Gelsinger had inherited a metabolic disorder, and like other patients in the trial, received an injection straight to his liver of the ammonia-digesting gene his body lacked. Four days later, multiple organs failed, and Gelsinger was taken off life support. After his death, investigations uncovered a tangle of ethical lapses. Critics said inadequate information had been provided about the study’s risks and pointed out that a key administrator at the University of Pennsylvania center behind the study had a financial conflict of interest.
Mildred Cho, a bioethicist and professor at the Stanford School of Medicine, sits on NExTRAC, the panel that advises the National Institutes of Health (NIH) on emerging biotechnologies. She says she’s “concerned that the factors at play in Jesse Gelsinger’s death have not actually been eliminated.” Specifically, Cho is wary of the risks of clinical trials moving too quickly in an environment where patients, physician-scientists and pharmaceutical companies alike are anxious to alleviate devastating medical conditions. “I think there’s a lot of pressuring pushing these new technologies forward, and at the same time, there’s more reluctance to regulate,” she says.
In the U.S., the current scientific consensus is that CRISPR is worth the risk, particularly to treat serious diseases with few alternative options. Other gene therapies have been successful before, like the cancer treatments Kymriah and Yescarta. But unlike most other gene editing techniques, CRISPR is relatively easy to engineer and use, opening up the floodgates for possible applications. The potential of tools like CRISPR to cure currently unfixable diseases represents a “massive paradigm shift from taking a pill for the rest of your life,” Gill says.
CRISPR is no miracle cure, yet. Larger trials must follow this preliminary work before the FDA can approve any new treatment. James Wilson, the former director of the University of Pennsylvania center that ran the trial in which Jesse Gelsinger died, said in a recent interview: “It’s going to be a long road before we get to the point where editing would be deemed safe enough for diseases other than those that have really significant morbidity and mortality.”
But for conditions that often prove deadly or debilitating, a little genetic engineering, done properly, could go a long way.
Lila Thulin is the former associate web editor, special projects, for Smithsonian magazine and covers a range of subjects from women's history to medicine.
NSF Emerging Frontiers in Research and Innovation program funds advancements in tissue regeneration, gene therapy, DNA mobility and epigenetic editing
Nearly every cell in your body contains the exact same DNA, from your skin cells to your brain cells. But how does a cell know how and when to turn into skin, muscle or brain?
Imagine that the DNA in cells is a long, twisted ladder made of billions of tiny building blocks. This DNA ladder carries all the instructions that tell your body how to grow, function and repair itself. When stretched out, the ladder in each human cell is 2 meters long, and it is difficult to imagine how it fits inside. Chromatin is how life solves this problem. Think of chromatin as a way of organizing DNA to fit within the nucleus (the control center of a cell). Chromatin is made of DNA wrapped around special proteins called histones to form a structure that looks like beads on a string, which is then looped and tightly compacted into chromosomes. This way DNA can be packed into a small space and unpacked whenever the cell needs access to genetic information.
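The 2-meter figure can be checked with a quick back-of-the-envelope calculation. The sketch below assumes a spacing of about 0.34 nanometers between neighboring rungs of the ladder and roughly 6.2 billion rungs (base pairs) in a typical human cell. Both are textbook approximations rather than values from this article, and the 10-micrometer nucleus width is likewise only an illustrative assumption.

```python
RISE_PER_BASE_PAIR_NM = 0.34   # approximate spacing between rungs of the DNA ladder
DIPLOID_BASE_PAIRS = 6.2e9     # approximate rungs per human cell (both genome copies)
NUCLEUS_DIAMETER_UM = 10.0     # illustrative nucleus width, an assumption


def stretched_length_m(base_pairs, rise_nm=RISE_PER_BASE_PAIR_NM):
    """Length of fully stretched DNA in meters."""
    return base_pairs * rise_nm * 1e-9


length_m = stretched_length_m(DIPLOID_BASE_PAIRS)
fold = length_m / (NUCLEUS_DIAMETER_UM * 1e-6)

print(f"Stretched DNA length: {length_m:.2f} m")
print(f"Length relative to nucleus width: {fold:,.0f}x")
```

Under these assumptions the DNA is a few hundred thousand times longer than the nucleus is wide, which is why it needs such tight packing.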
DNA contains both coding and non-coding sequences. Proteins, which are essential cellular building blocks and mediators, are built using instructions contained in specific segments of coding DNA known as genes. Non-coding DNA plays a crucial supporting role by controlling when and how these genes are turned on or off for expression into proteins. Many non-coding regions enable chromatin interactions that regulate its structure and dynamics. For cells to become distinct tissues, many genes must be turned on and off across different DNA regions and over time. Chromatin organization can control this process — tightly packed chromatin restricts access to genes, keeping them off, whereas loosely packed chromatin allows genes to be turned on and expressed. Chromatin organization is influenced by chemical modifications of DNA and histone proteins, which thereby affect gene expression.
Thus, chromatin not only solves the problem of fitting DNA into a cell, it also provides a mechanism for regulating how the information in DNA is used.
Even though people inherit a fixed set of genes, their expression can be influenced by many factors throughout their lives, including environmental factors such as diet, stress and exposure to pollution. This phenomenon, called epigenetics, controls the identity and function of cells, in addition to the genetic sequence in DNA. To fully understand and potentially manipulate a cell's destiny, researchers must understand both its genetics and epigenetics.
Every two years, the Office of Emerging Frontiers and Multidisciplinary Activities (EFMA) in the Directorate for Engineering at the U.S. National Science Foundation identifies out-of-the-box research topics for the NSF Emerging Frontiers in Research and Innovation (EFRI) program. Under four-year grants, interdisciplinary teams work on transformative, high-risk, high-reward projects to tackle the biggest challenges facing the nation.
In 2018 and 2019, EFRI focused on chromatin and epigenetic engineering to find new ways to control how genes are turned on and off. Through deeper knowledge and novel tools, researchers can engineer gene expression for many applications, including combatting disease, boosting crop plant performance or developing organisms that can remediate environmental damage.
Vadim Backman focuses on understanding and controlling chromatin organization. His team developed a high-resolution genome imaging platform to visualize chromatin in 3D, enabling more accurate predictions for genome engineering outcomes.
Backman’s interdisciplinary team combines genome biology with physics to model genome functions. They classify cellular features, like DNA structure and accessibility, to predict the likelihood of gene activity from chromatin edits. This precise manipulation has applications in cancer treatment, organ regeneration, injury prevention, and reversing aging.
The team is developing drugs and interventions targeting cells affected by cancer or oxygen loss from strokes or heart attacks. For example, they developed an electromagnetic stimulation technique that alters chromatin and gene expression, enabling heart cells to quickly repair damaged tissue.
Megan King's goal was to understand the relationship between chromatin structure and its functions and to engineer a device to measure changes in chromatin mobility.
King and her team discovered that a protein complex called INO80 is an important driver of chromatin movements inside the nucleus, and they are engineering a device to watch chromatin interactions happen in real time inside living cells. Previous methods analyzed millions of cells in aggregate at a single time point. The new device can follow what is happening in a single cell over many time points. This is crucial for understanding complex tissues made up of many different cell types, like the brain or immune system.
Carlos Castro has made important advances with his team in delivering DNA into cells using nanostructures. Using principles from origami paper folding to create intricate designs, researchers can package genetic information very tightly within these nanostructures, enabling the delivery of even the longest genes into the nucleus. This new technology offers a safer, more cost-efficient alternative to traditional viral gene therapy, with potential applications in treating diseases and improving live cell imaging.
Additionally, DNA origami structures can control how gene products interact with cell components, enabling the manipulation of cell properties and functions. This capability could be used in tissue engineering to create artificial tissues and organs.
Charles Gersbach's project uses the cancer-associated gene called MYC as a case study to test how changes in chromatin architecture lead to changes in gene expression and tumor characteristics. The team developed new genome-editing technologies to specifically target the non-coding regulatory regions of DNA that turn genes on or off.
This approach can add or remove chemical modifications (epigenetic marks), mimicking changes that might occur in nature in response to the environment. This epigenome engineering approach, which addresses variations in the non-coding genome linked to disease susceptibilities, can improve disease interventions. An epigenetic editing company, Tune Therapeutics, was founded to develop new therapies based on this research.
Many EFRI teams leverage the NSF Research Experience and Mentoring program to provide paid research experiences and mentoring to broaden participation and include more diverse talents in engineering. Backman's team offers an opportunity for high school students and undergraduates to participate in research. King supports undergraduates from underrepresented minority groups and/or first-generation, low-income college students to begin their careers. Castro enables undergraduates to experience research merged with technology development and entrepreneurship.
The EFRI projects have yielded groundbreaking advancements in the understanding and manipulation of gene expression. Supported by interdisciplinary research and mentoring programs, these collaborative efforts have advanced scientific knowledge and fostered a new generation of scientists equipped to tackle complex challenges in genetic and epigenetic engineering.
DNA, or deoxyribonucleic acid, is the molecule that carries the genetic instructions used in the growth, development, functioning, and reproduction of all known living organisms and many viruses. Structurally, DNA is composed of two strands that coil around each other to form a double helix, with each strand consisting of a sugar-phosphate backbone attached to nitrogenous bases. These bases—adenine (A), thymine (T), cytosine (C), and guanine (G)—pair specifically (A with T, and C with G) to encode the genetic information. DNA’s role is pivotal in heredity and gene expression, guiding the synthesis of proteins via processes like transcription and translation. Since the discovery of its structure by Watson and Crick in 1953, DNA has been central to the fields of genetics and molecular biology, enabling advancements such as genetic engineering, forensic science, and medical diagnostics.
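The pairing rule described above (A with T, C with G) means that one strand fully determines the other. A minimal sketch of that idea follows; the function names are chosen for illustration, and the code handles only the four standard bases.

```python
# Translation table implementing the pairing rule: A<->T and C<->G.
PAIRING = str.maketrans("ACGT", "TGCA")


def complement(strand):
    """Return the base that pairs with each position of the strand."""
    return strand.upper().translate(PAIRING)


def reverse_complement(strand):
    """Return the partner strand read in its own direction.

    The two strands of the double helix run in opposite directions,
    so the partner is the complement read backwards.
    """
    return complement(strand)[::-1]


print(complement("ATGC"))
print(reverse_complement("ATGC"))
```

Applying `reverse_complement` twice returns the original strand, which reflects how each strand can serve as a template for rebuilding the other.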
Scientists used light-activated droplets to reposition DNA, offering new insights into gene expression and disease…
The Las Gobas study offers new insights into the genetic isolation and disease history of…
DNA methylation, essential for regulating gene expression and cell function, is crucially maintained by CDCA7,…
An innovative study of DNA’s hidden structures may open up new approaches for the treatment…
Scientists have created the “Retro-Age” clock using ancient viral DNA markers to predict biological age,…
According to the research, these mitochondrial DNA insertions could be linked to early death. Mitochondria…
Researchers have discovered a “spatial grammar” in DNA that redefines the role of transcription factors…
Researchers from NC State and Johns Hopkins have developed a breakthrough technology that leverages DNA…
Research shows that early colonial dogs in North America reflected cultural tensions between Native Americans…
HiDEF-seq, a groundbreaking technique from NYU Langone Health, identifies early DNA changes that precede mutations,…
Scientists have created artificial double-helical complexes with properties that allow for controllable chirality switching. DNA,…
A new CRISPR-Cas9 based method called SEED/Harvest integrates the Single-Strand Annealing repair pathway to modify…
DNA is crucial for life, and its organization has been a significant scientific challenge. GROVER,…
Researchers from the LMS and LMB have discovered how the D2-I protein complex identifies and…
Researchers at the Garvan Institute have utilized artificial intelligence to identify potential cancer-causing elements within…
Recent research supports the “Out-of-Africa” theory, showing how the FTO gene variant rs1421085 T>C has…
Microorganisms reveal how our single-celled predecessors incorporated viral DNA into their own genomes. Researchers have…
A breakthrough now makes it possible to assemble the genomes of extinct species. A team…
Plasmodium falciparum, the parasite that causes malaria in humans, forms protrusions called “knobs” on the surface of its host red blood cell which enable it to avoid destruction and cause inflammation. (Photo courtesy of National Institutes of Health)
To analyze the genome or the genetic characteristics of a living organism, scientists typically rely on samples of millions of cells. The problem is that the DNA in each of our cells is not identical.
Until recently, the amount of DNA that could be extracted from a single cell couldn’t provide enough material for genetic analysis, but advances in single-cell genomics could be the key to solving some of the mysteries of diseases like cancer, which is the result of damage to individual cells. It could also help researchers better understand complex bodily systems like the brain and the immune system that are composed of a variety of cell types, each with their own unique genetic characteristics.
As a means of solving the problems posed by single-cell genomics, a process called whole-genome amplification is providing researchers with ways to generate sufficient quantities of DNA necessary for analysis by replicating the genetic material extracted from each cell. The process is not without its challenges, but a paper by Shiwei Liu, a Ph.D. candidate in biology in the University of Virginia’s College of Arts & Sciences; UVA biology professor Jennifer L. Güler; and others, published recently in the journal Genome Medicine, outlines an approach to whole-genome amplification resulting from a collaboration with neuroscientists in UVA’s School of Medicine that could provide an effective framework for creating new and more effective treatments for a variety of diseases.
Assistant professor of biology Jennifer Güler studies the genetic and metabolic mechanisms that allow malaria to adapt to and survive anti-malarial medications. (Photo by Molly Angevine)
Güler and Liu study the single-celled protozoan parasite, called Plasmodium, that causes malaria, a disease that kills nearly half a million people every year. There are no effective vaccines in widespread use for the disease, and one of the problems the medical community faces is that the organism can rapidly develop resistance to the drugs that have been developed to wipe it out. Güler’s team has been working to understand cellular mechanisms that allow it to survive and how genetic diversity within the parasite population affects its resistance to drugs.
“If you take them on a single-cell level, we start to appreciate that individual cells in a population of cells actually have small differences, and those small differences might not be noticeable, but they can have an impact if they disrupt how drugs or other treatments work,” Güler said.
In recent years, scientists have been finding ways to capture and extract DNA from single cells, which makes it possible to identify the small but critical differences between individual cells. However, the process requires a series of steps that create additional problems for researchers attempting to amplify the DNA, a process that involves reproducing enough identical copies of that DNA to be able to identify, or sequence, its component parts. The amplification process is especially challenging for malaria researchers.
Shiwei Liu, a UVA Ph.D. candidate in biology, was lead author on a paper published recently in the journal Genome Medicine on single-cell sequencing. (Contributed photo)
“The genome of the malaria parasite is really small, almost 300 times smaller than the human genome, so if we capture one genome from the malaria parasite, we’re starting at a much lower level than we need to be able to sequence it, so we have to use a really sensitive, highly specific method to be able to amplify it,” Güler explained. “Then you sequence it, and presumably everything in that sequence of all those different copies is going to reflect that first genome, and this is where a big challenge comes in. When you make those many copies, you introduce errors, and you can’t always assume that those many copies reflect the initial genome. That’s been a big problem in single-cell genomics.”
Because the Plasmodium parasite lives in the human bloodstream, Güler and her team also needed a method that would allow them to preferentially amplify the genome of the protozoan over its host, a problem that is unique to studying organisms that live inside the cells of other organisms.
The solution came as the result of a collaboration with Mike McConnell, a neuroscientist who works as an investigator at the Lieber Institute for Brain Development Maltz Research Laboratories in Baltimore. Güler met McConnell when he worked in the UVA School of Medicine’s Department of Biochemistry and Molecular Genetics.
McConnell specializes in single-cell genome analysis for human brain cells and had already developed strategies for capturing single cells. He had also worked with Ian Burbulis, an assistant professor of biochemistry and molecular genetics at UVA, to use a method called multiple annealing and looping-based amplification cycles, or MALBAC, to solve some of the problems inherent in the process of single-cell genome amplification.
Güler recognized the similarities in the challenges they were facing. Her team used McConnell’s method for capturing single cells and adapted the MALBAC method to reproduce the Plasmodium DNA accurately while limiting contamination from its host’s DNA.
“The collaboration with Mike McConnell’s lab helped build the basis for our single-cell sequencing project. They not only provided the original standard protocol of the whole genome amplification method called MALBAC, but they also offered instructions to conduct the essential steps of the single-cell sequencing pipeline, including single-cell isolation, whole-genome amplification and data analysis,” Liu said.
“We had worked out all of the molecular biology steps, the enzymes to use and how to analyze it and that sort of thing, and we made some improvements to make it better, and Jenny was able to start there,” McConnell said. He gave credit to Güler and Liu for seeing the potential that his research offered.
“Jenny and Shiwei did the heavy lifting to take what we had done and make it work for malaria,” McConnell said.
“I think our collaboration with his lab from the very beginning of his project was absolutely instrumental because we could go to his lab and learn what they were doing instead of starting from ground zero, so we started at a much higher level,” Güler said. “It was a team effort that ended up being very successful because we had that head start.”
“A lot of this single-cell technology is really focused on human cells, and that’s great; we want to learn more about human health,” Güler said. “But when you have these microbes or other organisms that have more challenging genomes, we need to be able to apply these methods to those genomes, too. This is one of the first studies to suggest that we can overcome those challenges.
“It’s a start for us to understand the biology of the malaria parasite, but it’s also a start for understanding other organisms with challenging genomes.”
Russ Bahorsky
UVA College and Graduate School of Arts & Sciences
June 1, 2021
At NHGRI, we are focused on advances in genomics research. Building on our leadership role in the initial sequencing of the human genome, we collaborate with the world's scientific and medical communities to enhance genomic technologies that accelerate breakthroughs and improve lives. By empowering and expanding the field of genomics, we can benefit all of humankind.
Genetics is the branch of science concerned with genes, heredity, and variation in living organisms. It seeks to understand the process of trait inheritance from parents to offspring, including the molecular structure and function of genes, gene behaviour in the context of a cell or organism (e.g. dominance and epigenetics), gene distribution, and variation and change in populations.
By mining large-population genetic data sets, researchers identify the key factors controlling menopause timing, and reveal a close connection between reproductive longevity, cancer risk and new mutations in children.
By analysing the ancient genomes of individuals from Rapa Nui, researchers have overturned a contentious theory that the remote Pacific island experienced a self-inflicted population collapse before European colonization.
The region of the human genome that harbours genes encoding amylase enzymes, which are crucial for starch digestion, shows extensive structural diversity. Amylase genes have been duplicated and deleted several times in human history, and structures that contain duplicated versions of the genes were favoured by natural selection after the advent of agriculture.
Here the authors use UK Biobank data to identify 251 genetic loci associated with serum triglycerides to HDL-cholesterol ratio, a surrogate marker for insulin resistance. Key genes, including PLA2G12A, PLA2G6, and TNFAIP8, offer potential therapeutic targets for metabolic diseases.
Immunoglobulin G (IgG) is the main isotype of antibody in human blood. Here the authors describe 14 genetic variants that affect IgG levels in blood. The data provide new insight into the regulation of humoral immunity that could be useful in the development of antibody-based therapeutics.
CRISPR homing gene drives can suppress pest populations by targeting female fertility genes, converting wild-type alleles into drive alleles in the germline of drive heterozygotes. Here the authors demonstrate a genetic pest suppression system based on dominant female-sterile doublesex alleles and show that releases of transgenic males eliminated Drosophila cage populations, with modelling showing improved performance compared to similar systems.
Kelemen et al. find that leveraging related traits improves polygenic score performance for abdominal aortic aneurysm. Health-economic modelling suggests that combining smoking and genetic risk information may improve cost-effectiveness of screening.
Identification and deletion of the mouse Klotho gene enhancer reveals HNF1b-driven, sexually dimorphic regulation of Klotho expression in the kidney and shows the effects of Klotho depletion on phenotype.
Polygenic basis for seedless grapes. X chromosome dosage shapes renal cell carcinoma risk.
Geneticists are trying to understand the elevated risks of heart and metabolic disease among people of South Asian ancestry, but some question whether a purely biological approach is best.
DNA (deoxyribonucleic acid) is the nucleic acid polymer that forms the genetic code for a cell or virus.
Researchers have developed a tool that can bend DNA strands using light. The work represents a new way to probe the genome. Shown here, from an unrelated study, are chromosomes (blue) inside a human cell nucleus. Steve Mabon, Tom Misteli, NCI Center for Cancer Research, National Cancer Institute, National Institutes of Health
In this new work, they utilized an additional component that attaches the condensate to specific locations on the DNA strands and directs their movement quickly and precisely via surface tension-mediated forces also known as capillary forces, which Princeton researchers had suggested could be ubiquitous in living cells. Previously, moving DNA ...
Published May 10, 2023 Updated May 12, 2023. More than 20 years after scientists first released a draft sequence of the human genome, the book of life has been given a long-overdue rewrite. A more ...
The new research introduces 400 million letters to the previously sequenced DNA - an entire chromosome's worth. The full genome will allow scientists to analyze how DNA differs between people ...
A research effort led by Stanford scientists set the first Guinness World Record for the fastest DNA sequencing technique, which was used to sequence a human genome in just 5 hours and 2 minutes. ... "They told us there's this brand-new research that they were working on to try to speed up the process of diagnosis," Jenny Kunzman said ...
CRISPR-based genetic screens are providing new insights into the consequences of deficiencies in DNA damage response and repair pathways. These include insights into the regulation of homologous ...
In 1987, the New York Times Magazine characterized the Human Genome Project as the "biggest, costliest, most provocative biomedical research project in history." 2 But in the years between the ...
New technique will allow programmable manipulation of large DNA segments. A team of researchers led by Harvard and Broad Institute scientists has developed twin prime editing, a new, CRISPR-based gene-editing strategy that enables manipulation of gene-sized chunks of DNA in human cells without cutting the DNA double helix.
A new complete Y chromosome sequence just might combat this dangerous myth. Learn and share the most exciting discoveries, innovations and ideas shaping our world today. DNA coverage from ...
A Princeton research team has developed a groundbreaking tool to study chromosomes by physically moving DNA strands around. Having found and turned a key, they can access the deepest mechanisms of ...
According to Garvan's Mahdi Zeraati, the first author of the new study, the i-motif is only one of a number of DNA structures that don't take the double helix form - including A-DNA, Z-DNA, triplex DNA and Cruciform DNA - and which could also exist in our cells. Another kind of DNA structure, called G-quadruplex (G4) DNA, was first ...
The first gene-editing cure has arrived. Grateful patients are calling it "life changing." It was only 11 years ago that scientists first developed the potent DNA-snipping technology called ...
They ended up with 99 scientists working directly on sequencing the human genome, and dozens more pitching in to make sense of the data. The researchers worked remotely through the pandemic ...
Yoonji Kim *24 was member of research team that developed tool to help researchers better understand gene expression. With the flick of a light, researchers have found a way to rearrange life's basic tapestry, bending DNA strands back on themselves to reveal the material nature of the genome. Scientists have long debated about the physics of chromoso
A molecular jack-of-all-trades. DNA is much more than the genetic information it carries. It is a versatile material for creating systems with tailor-made functionalities that are having an ...
Next-generation sequencing (NGS) is a powerful tool used in genomics research. NGS can sequence millions of DNA fragments at once, providing detailed information about the structure of genomes, genetic variations, gene activity, and changes in gene behavior. Recent advancements have focused on faster and more accurate sequencing, reduced costs ...
A research team has finally completed the sequence of the human genome, filling in the last 8 percent of the genome's 3 billion nucleotides. ... The new DNA sequences in and around the centromere ...
Human genetic engineering has also witnessed more regulated advances. In the past 12 months, four clinical trials launched in the United States to use CRISPR to treat and potentially cure patients ...
Precise Genetics: New CRISPR Method Enables Efficient DNA Modification. July 30, 2024 — A research group has developed a new method that further improves the existing CRISPR/Cas technologies: it ...
Credit: Brenna Henn/UC Davis. "This uncertainty is due to limited fossil and ancient genomic data, and to the fact that the fossil record does not always align with expectations from models built using modern DNA," she said. "This new research changes the origin of species." Research co-led by Henn and Simon Gravel of McGill University ...
DNA contains both coding and non-coding sequences. Proteins, which are essential cellular building blocks and mediators, are built using instructions contained in specific segments of coding DNA known as genes. ... An epigenetic editing company, Tune Therapeutics, was founded to develop new therapies based on this research. Empowering Future ...
DNA, or deoxyribonucleic acid, is the molecule that carries the genetic instructions used in the growth, development, functioning, and reproduction of all known living organisms and many viruses. Structurally, DNA is composed of two strands that coil around each other to form a double helix, with each strand consisting of a sugar-phosphate ...
Research Open Access 10 Sept 2024 Scientific Reports Volume: 14, P: 21119 Genomic Balancing Act: deciphering DNA rearrangements in the complex chromosomal aberration involving 5p15.2, 2q31.1, and ...
New Approach to DNA Research Could Be Key to Solving Mysteries of Deadly Diseases. By Russ Bahorsky. June 1, 2021. Plasmodium falciparum, the parasite that causes malaria in humans, forms protrusions called "knobs" on the surface of its host red blood cell which enable it to avoid destruction and cause inflammation.
DNA Research is the official journal of Kazusa DNA Research Institute, published by Oxford University Press and supported by funding from Chiba Prefecture, Japan. Growing Impact Factor, fully open access journal, low open access charges, and more. Volume 26, Issue 6:
The fluorescence of 2-aminopurine (2AP) probes local dynamics in DNA and RNA because the surrounding bases quench it. 2AP-labeled probes can bind to specific DNA or RNA sequences, enabling the detection of genetic mutations, viral RNA, or other nucleic acid-based markers associated with diseases such as cancer and infectious diseases.