Introduction:
- DNA sequencing is the process of determining the order of nucleotides in a DNA molecule. Nucleotides are the building blocks of DNA and are made up of a sugar, a phosphate group, and a nitrogenous base. There are four types of nitrogenous bases in DNA: adenine (A), thymine (T), cytosine (C), and guanine (G). The specific order of these bases, known as the DNA sequence, carries the genetic information that is passed from one generation to the next.
- It is an important tool in molecular biology and has many applications in areas such as genetics, medicine, and forensics.
Principle:
- The basic principle of DNA sequencing is to determine the order of nucleotides in a DNA molecule, that is, the order of the four bases adenine (A), thymine (T), cytosine (C), and guanine (G) along the strand.
- There are several methods for DNA sequencing. The classic and still widely used method is Sanger sequencing, also known as dideoxy or chain-termination sequencing. A new DNA strand complementary to the template strand is synthesized in the presence of a small proportion of chain-terminating nucleotides, called dideoxynucleotides. Because a dideoxynucleotide can be incorporated at any position, synthesis stops at random points and produces fragments that end at every base of the sequence. The fragments are separated by size using gel electrophoresis, and the sequence is read from the terminating base of each successively longer fragment.
- Other DNA sequencing methods include pyrosequencing, which is based on the detection of pyrophosphate release during DNA synthesis, and next-generation sequencing (NGS), which uses high-throughput sequencing technologies to generate large amounts of sequence data quickly and cost-effectively.
Types:
There are several methods for DNA sequencing, including:
Sanger sequencing:
Also known as dideoxy or chain-termination sequencing, this method involves synthesizing a new DNA strand complementary to the template DNA strand in the presence of a small proportion of chain-terminating nucleotides, called dideoxynucleotides. A dideoxynucleotide can be incorporated at any position, and synthesis stops wherever one is incorporated, so the reaction yields a set of fragments ending at every base of the sequence. The resulting DNA molecules are then separated according to size by gel electrophoresis, and the order of nucleotides is determined by reading the terminating base of each fragment from shortest to longest.
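The chain-termination idea can be illustrated with a small simulation. This is a minimal sketch, not a model of real chemistry: the function names, the 10% termination probability, and the convention that the template is written so the new strand reads left to right are all assumptions made for illustration.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def sanger_fragments(template, n_molecules=2000, dd_fraction=0.1, seed=0):
    """Simulate chain-terminated copies of one template strand.

    Returns a dict mapping fragment length -> terminating base.
    Assumption: the template is written in the order the polymerase reads it.
    """
    rng = random.Random(seed)
    new_strand = [COMPLEMENT[base] for base in template]
    fragments = {}
    for _ in range(n_molecules):
        for length, base in enumerate(new_strand, start=1):
            # With probability dd_fraction a dideoxynucleotide is
            # incorporated here and synthesis of this molecule stops.
            if rng.random() < dd_fraction:
                fragments[length] = base
                break
    return fragments


def read_sequence(fragments):
    """Read the new strand by ordering fragments from shortest to longest,
    which is what size separation by electrophoresis achieves."""
    return "".join(fragments[length] for length in sorted(fragments))


if __name__ == "__main__":
    fragments = sanger_fragments("TACGGATCCATG")
    print(read_sequence(fragments))
```

With enough simulated molecules, every fragment length is represented, so sorting by length recovers the complementary strand; taking the complement again gives the template.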
Pyrosequencing:
This method is based on the detection of pyrophosphate released during DNA synthesis. The four nucleotides are dispensed one type at a time. When a nucleotide is incorporated into the new strand, the released pyrophosphate is converted to ATP by the enzyme ATP sulfurylase, and the ATP drives a luciferase reaction that emits light. The light signal is proportional to the number of nucleotides incorporated, so the sequence of the template DNA strand is determined from the order and intensity of the signals.
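The readout of pyrosequencing can be sketched as a "flowgram": one signal per dispensed nucleotide. The example below is an idealized illustration in which the signal is simply the number of bases incorporated; the function names and the A, C, G, T dispensation order are assumptions, and real instruments have noise and limited resolution for long runs of the same base.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def pyrosequence(template, dispensation_order="ACGT", max_cycles=50):
    """Return a flowgram as a list of (nucleotide, signal) pairs.

    The signal counts bases incorporated in that flow, an idealized
    stand-in for the light intensity produced by released pyrophosphate.
    """
    to_synthesize = [COMPLEMENT[base] for base in template]
    position = 0
    flowgram = []
    for _ in range(max_cycles):
        for nucleotide in dispensation_order:
            signal = 0
            # A run of identical bases is incorporated in a single flow.
            while (position < len(to_synthesize)
                   and to_synthesize[position] == nucleotide):
                signal += 1
                position += 1
            flowgram.append((nucleotide, signal))
            if position == len(to_synthesize):
                return flowgram
    return flowgram


def flowgram_to_sequence(flowgram):
    """Rebuild the synthesized strand from the order and size of signals."""
    return "".join(nucleotide * signal for nucleotide, signal in flowgram)


if __name__ == "__main__":
    flows = pyrosequence("TTACGG")
    print(flows)
    print(flowgram_to_sequence(flows))
```

Flows with a signal of zero correspond to dispensed nucleotides that did not match the next template base, so no light is produced.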
Next-generation sequencing (NGS):
This method uses high-throughput sequencing technologies to generate large amounts of sequence data quickly and cost-effectively. There are several different NGS platforms, including Illumina, Ion Torrent, and Pacific Biosciences (the last is usually classed as a third-generation platform). NGS is often used for large-scale genomic projects, such as the 1000 Genomes Project. The original Human Genome Project, by contrast, relied mainly on Sanger sequencing.
Single-molecule sequencing:
This method involves sequencing individual DNA molecules one at a time, without the need for amplification. It has the potential to provide highly accurate and long reads, but is currently more expensive and slower than other DNA sequencing methods.
Other methods:
Other DNA sequencing methods include Maxam-Gilbert chemical sequencing, which cleaves end-labelled DNA at specific bases with chemical reagents rather than synthesizing a new strand, and hybridization-based sequencing, which involves hybridizing a DNA molecule to probes that carry a label or reporter molecule.
History:
The history of gene sequencing may be traced back to the early 1950s, when Watson and Crick first proposed the structure of DNA. In the 1970s, scientists were able to sequence relatively short DNA fragments using the Maxam-Gilbert chemical sequencing method and the Sanger dideoxy method. The Sanger method remained the dominant approach until the advent of next-generation sequencing technologies in the 2000s. These allowed high-throughput sequencing of millions of DNA fragments simultaneously, revolutionizing our understanding of the genetic basis of health and disease.
Fig: History of Gene Sequencing
Generation of Sequencing:
First-generation sequencing
First Generation Sequencing Technologies include Sanger and Maxam-Gilbert sequencing.
In Sanger sequencing, the most widely used first generation method, a single DNA fragment is first amplified (originally by cloning, today often by the polymerase chain reaction, PCR) and then copied by DNA polymerase in the presence of normal nucleotides and dideoxynucleotide triphosphates (ddNTPs). When a ddNTP is incorporated into the growing DNA chain, synthesis terminates at that base, so the sequence of the fragment can be determined from the lengths of the terminated chains and the identity of the ddNTP at the end of each.
First generation sequencing is a relatively slow and expensive process compared to second generation sequencing techniques, which can sequence many fragments simultaneously. However, it is still used in some applications because of its high accuracy and ability to sequence longer fragments than some second generation techniques.
Second-generation sequencing
Second generation sequencing, also known as high-throughput sequencing or next-generation sequencing (NGS), is a technique used to determine the order of nucleotides in a DNA molecule. It is called “second generation” because it represents a significant improvement over the first generation of sequencing techniques, which were slower, more expensive, and had lower throughput.
Second generation sequencing technologies are based on the concept of massively parallel sequencing, which involves dividing the DNA molecule into small fragments, amplifying these fragments, and then simultaneously sequencing many of them at once. This allows for much faster and more efficient sequencing than first generation techniques, which typically sequenced one fragment at a time.
There are several different second generation sequencing technologies available, including Illumina, Ion Torrent semiconductor sequencing, Roche/454 sequencing, and ABI SOLiD, which differ in their underlying chemistry and the platforms they use. These technologies have revolutionized the field of genomics, making it possible to sequence an entire human genome in a matter of days for a fraction of the cost of first generation techniques. Second generation sequencing is now widely used in a variety of applications, including gene expression analysis, genetic variation studies, and the identification of genetic mutations.
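The idea of breaking DNA into many short fragments and sequencing them in parallel can be sketched as follows. This is a toy illustration only: the function names, the random sampling of read positions, and the toy genome are assumptions, and the coverage formula (number of reads times read length divided by genome length) is a standard rule of thumb rather than something stated in this text.

```python
import random


def sample_reads(genome, read_length, n_reads, seed=0):
    """Draw short reads from random positions, mimicking the sequencing of
    many fragments at once. Returns a list of (start, read) pairs."""
    rng = random.Random(seed)
    last_start = len(genome) - read_length
    starts = [rng.randint(0, last_start) for _ in range(n_reads)]
    return [(start, genome[start:start + read_length]) for start in starts]


def mean_coverage(genome_length, read_length, n_reads):
    """Average number of reads covering each base."""
    return n_reads * read_length / genome_length


if __name__ == "__main__":
    toy_genome = "ACGT" * 250  # 1,000 bases, purely illustrative
    reads = sample_reads(toy_genome, read_length=50, n_reads=200)
    print(len(reads), "reads, mean coverage",
          mean_coverage(len(toy_genome), 50, len(reads)))
```

Higher coverage means each base is read more times, which helps later analysis distinguish true variants from sequencing errors.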
Fig: Different types of Gene Sequencing
Third-generation sequencing
Third-generation sequencing is a term used to describe newer DNA sequencing technologies that read single DNA molecules and produce long reads, typically thousands to tens of thousands of bases in length and in some cases more than a million. Raw read accuracy was initially lower than that of second generation methods but has improved substantially.
Some examples of third-generation sequencing technologies include:
Single-molecule, real-time (SMRT) sequencing: This technology, developed by Pacific Biosciences, observes a single DNA polymerase as it synthesizes a complementary strand of DNA using a template strand as a guide. Each of the four nucleotides carries a different fluorescent label, and incorporation is monitored in real time, which allows the sequence of the template strand to be determined.
Nanopore sequencing: This long-read technology, developed by Oxford Nanopore Technologies, involves passing a single DNA molecule through a nanopore and measuring the changes in electrical current as the nucleotides pass through the nanopore. The pattern of current changes is used to determine the sequence of the strand.
Optical mapping: This technology, developed by Bionano Genomics, uses nanochannel arrays to linearize and image long, fluorescently labelled DNA molecules. It does not read individual bases, but it provides long-range structural information that complements long-read sequencing.
Steps:
The steps of DNA sequencing depend on the specific method being used, but they generally include the following:
Sample preparation:
The first step in DNA sequencing is to prepare the DNA sample for analysis. This usually involves purifying the DNA and amplifying it using a technique called polymerase chain reaction (PCR).
Fragmentation:
Next, the DNA sample is fragmented into smaller pieces, typically by mechanical shearing (for example, sonication) or by enzymatic digestion, such as with a restriction enzyme. This allows the DNA to be sequenced in smaller, more manageable chunks.
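Where a restriction enzyme is used for fragmentation, the cut positions are determined by the enzyme's recognition site. The sketch below uses EcoRI, which recognizes GAATTC and cuts after the G, as an example. The function name and parameters are hypothetical, and only one strand is modelled, so sticky ends are ignored.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Cut a linear sequence at every occurrence of a recognition site.

    cut_offset is the number of bases of the site that stay on the
    upstream fragment (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    start = 0
    index = sequence.find(site)
    while index != -1:
        cut = index + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        index = sequence.find(site, index + 1)
    # Whatever follows the last cut is the final fragment.
    fragments.append(sequence[start:])
    return fragments


if __name__ == "__main__":
    print(digest("AAGAATTCCCGAATTCTT"))
```

Joining the fragments in order gives back the original sequence, which is a quick sanity check for any fragmentation routine.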
Primer annealing:
Primers, which are short stretches of DNA, are then added to the DNA fragments. These primers serve as a starting point for the synthesis of new DNA strands.
DNA synthesis:
The DNA synthesis step involves creating a new DNA strand complementary to the template DNA strand. This is done using a DNA polymerase enzyme and nucleotides, which are the building blocks of DNA.
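Primer annealing and strand synthesis can be illustrated with simple string operations. This is a minimal sketch under stated assumptions: sequences are written 5' to 3', the primer must match its binding site exactly, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(sequence):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


def find_binding_site(template, primer):
    """Return the index on the template where the primer anneals, or -1.

    A primer pairs with the stretch of template that is its reverse
    complement.
    """
    return template.find(reverse_complement(primer))


def extend_primer(template, primer):
    """Return the new strand made by extending the primer along the template."""
    site = find_binding_site(template, primer)
    if site == -1:
        raise ValueError("primer does not anneal to this template")
    # The polymerase adds bases to the primer's 3' end, copying the
    # template from the binding site toward the template's 5' end.
    copied_region = template[: site + len(primer)]
    return reverse_complement(copied_region)


if __name__ == "__main__":
    template = "ATGCCGTTAGCAAGT"
    primer = "ACTTGC"
    print(find_binding_site(template, primer))
    print(extend_primer(template, primer))
```

The new strand begins with the primer itself, reflecting the fact that the primer becomes the start of the synthesized strand.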
Fig: Steps of DNA sequencing
Chain termination:
In the chain termination step, chain-terminating nucleotides, called dideoxynucleotides, are incorporated into the new DNA strands at random positions. Each incorporation stops the synthesis of that strand, resulting in DNA fragments of different lengths.
Separation and analysis:
The resulting DNA fragments are then separated according to size by gel electrophoresis. The order of nucleotides in the template DNA strand is determined by reading the terminating nucleotide of each fragment in order of increasing length. In modern instruments this is usually automated with capillary electrophoresis and fluorescently labelled terminators, while NGS platforms instead detect bases as they are incorporated (sequencing by synthesis).
Data analysis:
Finally, the sequence data is analysed to identify any mutations or variations in the DNA sequence. This information can be used to understand the function of specific genes, identify genetic risk factors for diseases, or study the evolution of species.
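A very simple form of this analysis is to compare a sequenced sample with a reference sequence and list the positions where they differ. The sketch below assumes the two sequences are already aligned and of equal length, and it only detects substitutions; the function name and the example sequences are made up for illustration.

```python
def find_substitutions(reference, sample):
    """List single-base differences between two aligned sequences.

    Returns (position, reference_base, sample_base) tuples with 1-based
    positions. Insertions and deletions are not handled.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (index + 1, ref_base, sample_base)
        for index, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


if __name__ == "__main__":
    reference = "ATGCCGTTAG"
    sample = "ATGCAGTTGG"
    for position, ref_base, sample_base in find_substitutions(reference, sample):
        print(f"position {position}: {ref_base} -> {sample_base}")
```

Real variant analysis must also align reads and account for sequencing errors, which this sketch deliberately leaves out.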
Applications:
Knowing the order of nucleotides in a DNA molecule is important for a variety of applications:
Medical diagnosis and treatment: DNA sequencing can be used to identify genetic mutations that are associated with specific diseases or conditions. This information can help doctors diagnose and treat these conditions more effectively.
Genetic ancestry and genealogy: DNA sequencing can be used to trace an individual’s ancestry and genealogy by analysing DNA samples from different populations and comparing them to determine common ancestry.
Agricultural and environmental research: DNA sequencing can be used to identify and classify different species of plants and animals, as well as to study the genetic diversity within a population. This information can be used to improve agricultural practices and to protect and preserve biodiversity.
Forensics: DNA sequencing can be used to identify individuals from DNA samples left at crime scenes, as well as to match DNA samples from suspects to those found at the crime scene.
Drug development: DNA sequencing can be used to identify genes that are associated with specific diseases or conditions, which can help researchers develop new drugs or therapies to treat these conditions.
Evolutionary biology: DNA sequencing can be used to study the evolutionary relationships between different species and to understand how organisms have evolved over time.
Gene sequencing and Whole genome sequencing:
- Gene sequencing refers to the process of determining the precise order of the nucleotide base pairs in a specific gene or a segment of DNA. This can be done using a variety of techniques, such as Sanger sequencing or next-generation sequencing (NGS). Gene sequencing is typically used to study a particular gene or a small region of the genome in order to understand its function or to identify genetic variations that may be associated with a particular trait or disease.
- On the other hand, whole genome sequencing involves determining the complete DNA sequence of an organism’s genome. This includes all of the genetic material present in the organism’s chromosomes, as well as any other DNA present in the organism, such as in the mitochondria. Whole genome sequencing allows researchers to study the entire genome of an organism, providing a comprehensive view of its genetic makeup. This can be useful for a variety of purposes, such as identifying genetic risk factors for diseases, studying evolution and population genetics, and developing personalized medical treatments.
Gene sequencing and whole genome sequencing are similar in that they both involve determining the sequence of nucleotide base pairs in DNA. However, they differ in several respects:
Scope: Gene sequencing is typically used to study a specific gene or a small region of the genome, while whole genome sequencing involves determining the complete DNA sequence of an organism’s genome.
Detail: Gene sequencing typically provides less comprehensive information than whole genome sequencing, as it only looks at a small portion of the genome, although it can examine that portion in depth. Whole genome sequencing, on the other hand, provides a more complete view of an organism’s genetic makeup.
Applications: Gene sequencing is often used to study a particular gene or region of the genome in order to understand its function or to identify genetic variations that may be associated with a particular trait or disease. Whole genome sequencing is used for a variety of purposes, including identifying genetic risk factors for diseases, studying evolution and population genetics, and developing personalized medical treatments.
Cost: Gene sequencing is typically less expensive than whole genome sequencing, as it requires less sequencing and analysis. Whole genome sequencing requires more resources and is therefore more expensive.
Limitations:
Cost: DNA sequencing can be expensive, especially for large genomes or for sequencing many samples.
Time: DNA sequencing can take a long time, especially for large genomes or for sequencing many samples.
Complexity: DNA sequencing can be complex and requires specialized equipment and trained personnel to perform.
Quality: DNA sequencing can have errors, and the quality of the sequencing data can be affected by various factors such as the quality of the DNA sample, the efficiency of the sequencing reaction, and the accuracy of the data analysis.
Limited information: DNA sequencing can only provide information about the DNA sequence itself, and cannot provide information about other aspects of the organism or its biology, such as protein expression or function.
Ethical considerations: DNA sequencing raises ethical considerations, such as privacy and the potential for misuse of genetic information.
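The quality limitation listed above is commonly quantified per base with a Phred quality score, Q = -10 log10(p), where p is the estimated probability that the base call is wrong. This convention is not described in the text above; it is added here as a widely used illustration, and the function names are hypothetical.

```python
import math


def phred_to_error_probability(quality):
    """Convert a Phred quality score to the probability of a wrong base call."""
    return 10 ** (-quality / 10)


def error_probability_to_phred(probability):
    """Convert an error probability to a Phred quality score."""
    return -10 * math.log10(probability)


def expected_errors(qualities):
    """Expected number of wrong bases in a read, given per-base qualities."""
    return sum(phred_to_error_probability(q) for q in qualities)


if __name__ == "__main__":
    for quality in (10, 20, 30):
        print(quality, phred_to_error_probability(quality))
```

For example, a score of 30 corresponds to an estimated error rate of 1 in 1,000 base calls.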