As previously touched on, the genome is the entirety of genetic material carried by an individual or species, and it varies accordingly. Databases of sequenced genomes from different species keep growing and include the human genome (the Human Genome Project). For example, the human genome, by chromosome, is viewable here: https://www.ncbi.nlm.nih.gov/genome/?term=homo+sapiens
With simple genomes, such as those of viruses, it is relatively straightforward to assign a protein to each gene and so build a database of all the proteins the genome encodes. This complete set of proteins is known as a proteome.
The information gleaned from a virus proteome, for example, can inform vaccination targets by selecting appropriate antigens such as elements of the viral capsid.
Other exciting synthetic biology applications can be explored too, such as glowing beer, or synthesising specific compounds useful in medicine or manufacturing in organisms that do not naturally make them, in an attempt to boost production or create new products.
Analysing and storing information about more complex genomes is hindered by non-coding DNA and regulatory genes, which take up the vast majority of this type of genome. The sequences that actually code for protein products are therefore in the minority.
The proteomes corresponding to complex genomes, human included, are therefore difficult to build. Sequencing methods themselves have undergone, and continue to undergo, a rapid evolution towards faster, more efficient, automated techniques that can yield tremendous amounts of data.
If we have obtained a DNA sample or a few, what next? Well, nothing much can be done with so little. We need vastly more DNA to use it for any purpose, and all of it must of course be identical: we must essentially clone our DNA. Considered the very staple of molecular biology, the technique for multiplying DNA many-fold was invented by Kary Mullis, who also happened to believe in astrology.
Essentially, the DNA is denatured by heating so that its two strands come apart; short complementary pieces called primers attach to the separated strands; the enzyme DNA polymerase binds at the primers and assembles a new DNA strand along each template; and finally the whole process is repeated many times over in a chain reaction. This is the polymerase chain reaction, PCR.
Soon enough, the few bits of DNA become thousands, and hundreds of thousands, and millions…
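The growth described above can be put into numbers with a small sketch. It assumes an idealised reaction in which every molecule is copied in every cycle, so the copy count doubles each time; the function name `pcr_copies` and the `efficiency` parameter are illustrative choices, not part of any standard tool.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Expected number of DNA copies after a number of PCR cycles.

    efficiency is the fraction of molecules copied in each cycle.
    1.0 means perfect doubling, which is an idealisation.
    """
    if cycles < 0:
        raise ValueError("cycles must not be negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    # Each cycle multiplies the copy number by (1 + efficiency).
    return initial_copies * (1.0 + efficiency) ** cycles


# Starting from a handful of molecules (hypothetical starting amount).
for n in (0, 10, 20, 30):
    print(f"{n:>2} cycles: {pcr_copies(5, n):,.0f} copies")
```

With perfect doubling, a single molecule passes the thousand mark after 10 cycles and the million mark after 20, which is why a few dozen cycles are enough in practice.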
Sanger sequencing, for example, was long the main method of sequencing DNA and has given rise to many variations. The basic concept follows these steps:
1. Mix copies of your target DNA to be sequenced with DNA polymerase, a primer, ordinary nucleotides and a small proportion of radioactively labelled chain-terminating nucleotides (with A, T, G or C bases)
2. Whenever one of these labelled nucleotides is incorporated, it prevents further lengthening of that strand, resulting in a mixture of strands of different lengths, each complementary to part of the template DNA, e.g. AATGGC creates TTACCG, TACCG, ACCG, CCG, CG and G
3. Run the DNA mixture on a gel to separate the different strands by size
4. Infer the sequence from the results: the radioactive reading of the different bases (A, T, C or G) alongside the size order of the strands (smaller strands run further down the gel while larger strands stay towards the top, where they were loaded)
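The logic of these steps can be mimicked in a few lines of code. The sketch below is a toy model, not a simulation of the chemistry: it assumes exactly one fragment is produced for every possible stopping point, writes fragments aligned against the template as in the AATGGC example, and assumes the labelled terminating base of each fragment is the first letter in that alignment. The function names are made up for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def chain_terminated_fragments(template: str) -> list[str]:
    """Return every truncated copy of the strand complementary to template.

    Fragments are written aligned against the template, so they all
    share the same right-hand end and differ in where they stop.
    """
    complementary = "".join(COMPLEMENT[base] for base in template)
    return [complementary[start:] for start in range(len(complementary))]


def read_gel(fragments: list[str]) -> str:
    """Rebuild the complementary strand from the separated fragments."""
    # Shorter fragments run further down the gel, so sort by length.
    by_size = sorted(fragments, key=len)
    # Assumption: the labelled terminator is the first letter of each fragment.
    terminal_bases = [fragment[0] for fragment in by_size]
    # The shortest fragment gives the right-most base, so reverse the order.
    return "".join(reversed(terminal_bases))


fragments = chain_terminated_fragments("AATGGC")
print(fragments)

complementary = read_gel(fragments)
template = "".join(COMPLEMENT[base] for base in complementary)
print(complementary, template)
```

Reading one labelled base per fragment, in order of fragment size, is all that is needed to recover the complementary strand, and from it the template.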
The sequence obtained can then be converted into the amino acid sequence of the protein it encodes, if the sequence belongs to a gene. Amino acid sequences can then be compared between individuals or conditions, for example to check whether a particular variant of a protein is associated with a blood disorder.
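As a sketch of that conversion, the code below translates a DNA coding sequence using the standard genetic code and lists where two proteins differ. It assumes the sequence is the coding strand and that the reading frame starts at the first base. The two short sequences are illustrative only, not taken from a database; they are loosely modelled on the well-known glutamate-to-valine change in sickle cell haemoglobin.

```python
from itertools import product

# Standard genetic code, listed with bases in the order T, C, A, G.
# '*' marks a stop codon; amino acids use single-letter codes.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna: str) -> str:
    """Translate a coding sequence codon by codon until a stop codon."""
    protein = []
    for i in range(0, len(coding_dna) - 2, 3):
        amino_acid = CODON_TABLE[coding_dna[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def differences(protein_a: str, protein_b: str) -> list[tuple[int, str, str]]:
    """List (position, residue_a, residue_b) wherever two proteins differ."""
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(protein_a, protein_b), start=1)
        if a != b
    ]


# Illustrative sequences differing by a single base (GAG -> GTG).
reference = "ATGGTGCACCTGACTCCTGAGGAGTAA"
variant = "ATGGTGCACCTGACTCCTGTGGAGTAA"

print(translate(reference))
print(translate(variant))
print(differences(translate(reference), translate(variant)))
```

A single changed base is enough to swap one amino acid for another, which is the kind of variation such a comparison is meant to reveal.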
Any two given people share 99.9% of their DNA code. But the differences present in the remaining 0.1% are enough to enable reliable identification, with the exception of monozygotic twins. The DNA regions used for this are called variable number tandem repeats (VNTRs) because they are sequences of DNA repeated many times over, with the number of repeats varying between individuals.
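To make the idea of a variable number of repeats concrete, here is a minimal sketch that counts how many times a motif is repeated back to back in a sequence. The motif and the two sequences are invented for illustration, and the function name is hypothetical.

```python
def longest_tandem_run(sequence: str, motif: str) -> int:
    """Largest number of back-to-back copies of motif found in sequence."""
    if not motif:
        raise ValueError("motif must not be empty")
    best = 0
    for start in range(len(sequence)):
        count = 0
        position = start
        # Keep stepping forward while another copy of the motif follows.
        while sequence.startswith(motif, position):
            count += 1
            position += len(motif)
        best = max(best, count)
    return best


# Two invented individuals carrying the same motif a different number of times.
person_a = "CC" + "GATA" * 5 + "TT"
person_b = "CC" + "GATA" * 9 + "TT"

print(longest_tandem_run(person_a, "GATA"))
print(longest_tandem_run(person_b, "GATA"))
```

The motif is the same in both sequences; only the repeat count differs, and that count is the trait that varies from person to person.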
Aside from genes, or coding DNA, there are non-coding regions which repeat themselves many times over in each individual, with some of the sequences within them varying from person to person. This variability is smaller between closely related individuals, and this is where the usefulness of genetic fingerprinting comes in. Its applications cover medicine, criminology and biodiversity conservation, among other things. The basic procedure is as follows:
1. The sample DNA undergoes PCR and is then cleaved at multiple sites with restriction endonucleases
2. The many small fragments that result are tagged using a radioactive molecule
3. The fragments are separated using gel electrophoresis and viewed on developed photographic film
The exposed bands then undergo simple visual analysis: the band pattern of the sample DNA is matched against that of other DNA which may be more or less similar, depending on the situation. In the crime scene example, the DNA found at the scene is compared with that of three suspects, and the bands of suspect 2 align perfectly with the crime scene DNA.
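The visual comparison can be expressed as a simple set comparison. In the sketch below each profile is a set of fragment sizes; the sizes are invented for illustration, and exact equality stands in for "perfectly aligned" bands, whereas a real gel would need some tolerance for measurement error.

```python
def matching_profiles(evidence: set[int], suspects: dict[str, set[int]]) -> list[str]:
    """Names of suspects whose band pattern is identical to the evidence."""
    return [name for name, bands in suspects.items() if bands == evidence]


def shared_band_count(evidence: set[int], bands: set[int]) -> int:
    """Number of bands two profiles have in common."""
    return len(evidence & bands)


# Invented fragment sizes (in base pairs) for each band on the gel.
crime_scene = {120, 250, 400, 610, 900}
suspects = {
    "suspect 1": {120, 300, 400, 700, 900},
    "suspect 2": {120, 250, 400, 610, 900},
    "suspect 3": {150, 250, 480, 610, 820},
}

for name, bands in suspects.items():
    print(name, shared_band_count(crime_scene, bands), "of", len(crime_scene))

print(matching_profiles(crime_scene, suspects))
```

Partial overlap is expected between unrelated people, so only a profile in which every band lines up counts as a match here.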
In the case of paternity tests, the child’s DNA fragments do not completely match their father’s, because some will be from the mother. Here, only the remaining fragments (the ones not from the mother) are matched up against potential fathers.
In the example, Mary (the mother) and the child share the first fragment. Looking at the child's remaining fragments, Larry is the father, as he shares 3/3 of them with the child, whereas Bob shares only 1/3.
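The same reasoning can be written as set arithmetic: remove the mother's bands from the child's, then count how many of the remaining bands each candidate father carries. The fragment sizes below are invented so that they reproduce the Mary, Larry and Bob example; the function name is hypothetical.

```python
def paternity_scores(
    child: set[int], mother: set[int], candidates: dict[str, set[int]]
) -> dict[str, tuple[int, int]]:
    """For each candidate, return (bands matched, bands to explain)."""
    # Bands the child cannot have inherited from the mother.
    remaining = child - mother
    return {
        name: (len(remaining & bands), len(remaining))
        for name, bands in candidates.items()
    }


# Invented fragment sizes (in base pairs).
mary = {100, 350, 720}
child = {100, 200, 450, 800}  # shares the 100 band with Mary
candidates = {
    "Larry": {200, 450, 800, 950},
    "Bob": {200, 300, 600},
}

print(paternity_scores(child, mary, candidates))
# → {'Larry': (3, 3), 'Bob': (1, 3)}
```

Only a candidate who accounts for every band not explained by the mother is consistent with being the father in this simplified model.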