As previously covered, mRNA is messenger RNA i.e. the molecule that takes the genetic information encoded by DNA (transcription) and brings it to the ribosome to initiate translation of the code into a polypeptide.
As previously seen, the production of mRNA in human cells is not a simple transcription of DNA. The pre-mRNA is the direct transcript of DNA, but it has to be spliced and go through a couple of other post-transcriptional modifications before it becomes mature mRNA, ready for translation at the ribosome.
Why is this significant for gene technologies? Gene tech relies on using genetic information in DNA to accomplish various feats. However, eukaryotic DNA contains introns that are not used as code for the desired product. Therefore, a sequence of DNA without introns must be produced.
This is accomplished by taking a mature mRNA sequence and transcribing it in reverse, back to DNA. The enzyme used for this is reverse transcriptase. This produces a usable DNA sequence free of introns (more on this later).
Right then, what are these post-transcriptional edits of mRNA?
In eukaryotes, genes contain non-coding sequences which must be removed before mRNA is used to produce proteins. These are called introns as opposed to exons which are coding sequences. Splicing therefore is the process of excising (cutting out) introns to be left with mRNA containing purely coding sequence.
This process can result in several different mRNA products from the same DNA sequence. If different combinations of exons are kept in the final transcript, the mRNA will code for a different sequence of amino acids. This is termed alternative splicing.
Since these two possible mRNA products code for different amino acids represented by the different colours (red-yellow-blue versus red-green-blue), the resulting protein after translation of mRNA could function differently. If an enzyme, it may affect its ability to catalyse its reactions, or its efficiency. Equally, the change could not make a difference in another scenario at all.
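The idea can be sketched in a few lines of Python. This is only an illustration: the sequences are made up, and the exon names simply borrow the colours mentioned above.

```python
# Hypothetical pre-mRNA, listed in order as (kind, name, sequence).
# All sequences are invented for illustration only.
PRE_MRNA = [
    ("exon", "red", "AUGGCU"),
    ("intron", None, "GUAAGUCAG"),
    ("exon", "yellow", "CCAGAA"),
    ("intron", None, "GUGAGUUAG"),
    ("exon", "green", "GGUUCA"),
    ("intron", None, "GUACGUCAG"),
    ("exon", "blue", "UGGUAA"),
]


def splice(pre_mrna, keep_exons):
    """Join the chosen exons in their original order; introns are always dropped."""
    return "".join(
        seq for kind, name, seq in pre_mrna
        if kind == "exon" and name in keep_exons
    )


isoform_a = splice(PRE_MRNA, {"red", "yellow", "blue"})
isoform_b = splice(PRE_MRNA, {"red", "green", "blue"})
print(isoform_a)  # AUGGCUCCAGAAUGGUAA
print(isoform_b)  # AUGGCUGGUUCAUGGUAA
```

Both products come from the same starting sequence, yet their middle sections differ, so they would be translated into different polypeptides.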
You might have noticed some additional bits on the final mature mRNA in the diagram, called the 5′ cap and the Poly-A tail. These, alongside splicing, are post-transcriptional modifications.
The 5′ cap is a blocking group at the 5′ end of the mRNA (it has the same 5′ – 3′ notation as DNA to signify which end is which) made of 7-methylguanosine. It stabilises the mature mRNA and provides the starting point to the ribosome for translation.
The Poly-A tail is a sequence of around a couple hundred adenosines (AAAAAAAA…) on the 3′ end of the transcript. It also stabilises the molecule, as well as guiding it out of the nucleus and into the cytoplasm where it finds a ribosome to commence translation.
Genetic engineering of bacteria
Say we are interested in the gene for insulin. Sure, we could take it straight from people, but remember humans are eukaryotes and eukaryotes have non-coding sequences within their genes called introns. The mRNA following splicing, on the other hand, has no introns! How can we make DNA from mRNA?
Take this arbitrary bit of mRNA: UCCAUGCCAUUUGGG
If we had an enzyme which could reverse the transcription back into DNA, this time intron-free, that would be great. We do – it’s called reverse transcriptase and it produces DNA. This special case of DNA is called complementary DNA – cDNA.
cDNA via reverse transcriptase: AGGTACGGTAAACCC (remember that DNA unlike mRNA is double-stranded; not shown for simplicity)
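The base pairing behind this step can be written as a short Python sketch. It is a simplification: it pairs each mRNA base with its DNA partner and writes the result in the same left-to-right order as the mRNA, as done above, ignoring strand direction and the second DNA strand.

```python
# Pairing rules between an RNA template and the DNA strand built on it.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(mrna):
    """Return the complementary DNA strand, written in the same order as the mRNA."""
    return "".join(RNA_TO_DNA[base] for base in mrna)


cdna = reverse_transcribe("UCCAUGCCAUUUGGG")
print(cdna)  # AGGTACGGTAAACCC
```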
If we wanted the portion after the second G above, is there a way we could cut the DNA? It appears so. Many bacteria have evolved enzymes whose job it is to chop up the DNA of invading viruses at specific sequences. These enzymes are called restriction endonucleases. Each has its own short sequence which it recognises. There is a restriction endonuclease called CviQI which has the recognition site GTAC and cuts between G and T. That fits our bill!
The other DNA strand will also have a GTAC site read in the opposite direction. Notice that the complementary sequence of GTAC backwards (starting from the C) is… GTAC! This is called a palindrome. Many restriction endonucleases, including most of those used in the lab, recognise palindromic sites like this one, which means the enzyme finds the same sequence on both strands.
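Both ideas, checking that a site is palindromic and cutting at it, can be sketched in Python. The function names and the cut_offset parameter are made up for this illustration; only one strand is modelled.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Read the opposite strand: complement every base, then reverse the order."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def is_palindromic(site):
    """A site is palindromic if it reads the same on both strands."""
    return site == reverse_complement(site)


def digest(dna, site, cut_offset):
    """Cut one strand at every occurrence of site, cut_offset bases into the site."""
    fragments, start = [], 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments


print(is_palindromic("GTAC"))  # True
# GTAC cut between G and T, i.e. one base into the site.
print(digest("AGGTACGGTAAACCC", "GTAC", 1))  # ['AGG', 'TACGGTAAACCC']
```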
DNA samples can be amplified by PCR (detailed later on) to make more of it, before further processing in experiments or applications. Larger fragments of DNA can be amplified directly in bacteria, as they grow quickly and are easy to handle.
In vivo (inside a living organism) gene cloning involves getting bacteria to take up target DNA that has been inserted into a plasmid, a circular bit of DNA they normally carry beside their “main” DNA. Plasmids are often transferred horizontally between bacteria, and are usually passed down through the generations. Target DNA can be inserted into a plasmid by cutting both with the same restriction endonuclease. This leaves matching sticky ends, which can be joined with the aid of the enzyme DNA ligase. It catalyses the formation of phosphodiester bonds in the sugar-phosphate backbone.
Plasmids also contain an antibiotic resistance gene which, if taken up successfully by bacteria, will enable their growth on a medium containing that antibiotic. This allows the selection of only bacteria which have taken up the plasmid (vector), and with it our DNA of interest.
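The logic of this selection step is simple enough to write down as a toy Python model. The colony labels are made up, and real plates are of course messier than a true/false flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Colony:
    label: str          # hypothetical name for a colony on the plate
    has_plasmid: bool   # the plasmid carries the resistance gene


def grows_on(colony, antibiotic_in_medium):
    """Without antibiotic everything grows; with it, only plasmid carriers do."""
    return colony.has_plasmid or not antibiotic_in_medium


def select(colonies, antibiotic_in_medium=True):
    """Return the labels of the colonies that survive on the medium."""
    return [c.label for c in colonies if grows_on(c, antibiotic_in_medium)]


plated = [Colony("A", True), Colony("B", False), Colony("C", True), Colony("D", False)]
print(select(plated))                              # ['A', 'C']
print(select(plated, antibiotic_in_medium=False))  # ['A', 'B', 'C', 'D']
```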
The host cells (bacteria) can now be grown on a large scale. They will express the new DNA in the plasmid, and pass it on to their offspring cells to do the same. Shortly, there will be a massive number of bacteria producing whatever the gene of interest codes for. This could be human insulin.
The product can then be isolated and used. It is in a pure form and is in fact being used worldwide to treat Type I diabetes. This breakthrough enabled better management of the condition which had previously been treated with non-human insulin which had side effects.
Many elements can be added to a plasmid, and a few of them are standard.
Restriction sites enable the easy addition of new genes, which are inserted between them. A promoter indicates where transcription starts for the sequence of DNA downstream of it. The aforementioned antibiotic resistance gene enables selection of microorganisms with this plasmid by exposing them to the antibiotic and observing the ones that grow regardless, thanks to the resistance gene contained on the plasmid.
The origin of replication indicates the point where the plasmid itself begins its own replication. The selectable marker can be used alternatively to the antibiotic resistance gene as a form of selection, and could be any sequence that the bacterium cannot survive without. It can be more effective than the antibiotic resistance gene in the event that antibiotic-based selection reaches a point where even bacteria without this plasmid are able to mount a resistant response, hence diluting the power of this selection method.
As a precaution against unforeseen effects of genetic modification, bacteria used in research contain genes that render them unable to survive in the outside world.
Since bacteria are prokaryotic and many genes introduced for them to process are of human sequence, processes that are exclusive to eukaryotes, e.g. post-translational modification of proteins, may not be successful in bacteria. This is why an alternative cultured microorganism for these purposes is the yeast Pichia pastoris, a eukaryote. This can also improve the folding of polypeptides.
Polymerase chain reaction (PCR)
DNA can be replicated in the lab (in vitro) by isolating the individual components required, such as enzymes, and adding a template DNA to the mix.
If we have obtained a DNA sample or a few, what next? Well, nothing much can be done with that. We must obtain exponentially more DNA to use for any purpose, and all of it must of course be identical. We must essentially clone our DNA. Considered the very staple of molecular biology, this technique for multiplying DNA many-fold was invented by a chap called Kary Mullis, who also believed in astrology.
The DNA template to be amplified can be extracted from a field sample (a leaf, human saliva, cultured microorganisms, etc.) or synthesised chemically, on order.
Essentially the DNA is denatured so the 2 strands break apart, short complementary bits called primers attach to the strands, the enzyme DNA polymerase binds to the primers and initiates the assembly of a new DNA strand, and finally the process is repeated many times over in a chain reaction. This is the polymerase chain reaction, PCR.
Soon enough, the few bits of DNA become thousands, and hundreds of thousands, and millions…
The components of PCR can fit in a very small tube which is placed in a specialised thermocycler or water bath in order to expose it to these fluctuating temperatures. Thermocyclers can be programmed to run automatically on a cycle along the lines of (degrees Celsius) 90-60-70 each for a few minutes, repeated many times over e.g. 30 times. Overall, this can take a few hours to complete.
The fluctuations in temperature correspond to each step in PCR. The highest temperature is required to separate the strands (denaturing). The lowest temperature allows the primers to bind (anneal) to the separated strands, which is required to kick-start replication by DNA polymerase. The intermediate temperature, lower than the denaturing step but higher than the annealing step, is required for the addition of nucleotides by the polymerase (extending).
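A thermocycler programme of this kind can be sketched as a small Python data structure, which also shows where the overall run time comes from. The temperatures are the rough ones quoted above; 2 minutes per step is an assumed stand-in for "a few minutes", not a recommended protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int      # degrees Celsius
    minutes: float   # assumed duration, for illustration only


PROGRAMME = (
    Step("denature", 90, 2.0),  # strands separate
    Step("anneal", 60, 2.0),    # primers bind
    Step("extend", 70, 2.0),    # polymerase adds nucleotides
)


def total_minutes(programme, cycles):
    """Total run time: the length of one cycle multiplied by the number of cycles."""
    return cycles * sum(step.minutes for step in programme)


run_time = total_minutes(PROGRAMME, cycles=30)
print(run_time / 60)  # 3.0 (hours)
```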
These temperatures are well above most physiological conditions where enzymes like polymerase would be functional, so special polymerases are used in PCR which are heat-resistant. They were isolated from microorganisms found living in hot springs and such extreme environments.
Other ingredients of PCR include the nucleotides themselves (free and ready to be added to new DNA strands by polymerase), other optimising agents such as magnesium ions for the DNA polymerase, and water.
The amount of DNA produced during PCR rises exponentially, roughly doubling with each cycle of heating and cooling. This can be tracked using a log scale. When starting out with a tiny amount of DNA, carrying out around 30 cycles takes a few hours and produces a vast amount of DNA.
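The arithmetic is easy to check with a short Python sketch. It assumes ideal conditions where every strand is copied in every cycle; the efficiency parameter is a made-up knob to show what happens when that is not quite true.

```python
import math


def copies_after(cycles, starting_copies=1, efficiency=1.0):
    """Copies after a number of cycles; efficiency=1.0 means perfect doubling."""
    return starting_copies * (1 + efficiency) ** cycles


for n in (0, 10, 20, 30):
    copies = copies_after(n)
    # On a log scale, exponential growth becomes a straight line.
    print(n, f"{copies:,.0f}", round(math.log10(copies), 2))

print(f"{copies_after(30):,.0f}")  # 1,073,741,824
```

So a single molecule becomes over a billion copies in 30 ideal cycles, and the log10 value climbs by the same amount every 10 cycles.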
Visualising DNA with gel electrophoresis
A common method of visualising differences is gel electrophoresis which involves loading small volumes of samples on a gel and running a current across it in order to separate the samples by size.
Since the gel has a microscopic matrix inside that provides resistance against sample movement through it, the larger molecules move more slowly while the smaller fragments can move more quickly.
The positive electrode is at the bottom of the tank, while the samples are loaded at the top. DNA molecules carry a negative charge because of their phosphate groups, so they move downwards towards the bottom of the gel. The current is run across the gel for around 30-60 minutes, taking care that the samples don’t run for too long and run off the gel into the buffer solution, where they would be lost. After that, the samples’ progression on the gel can be visualised using a stain solution or a pre-existing label visible under UV light.
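The separation by size can be illustrated with a toy Python model. It assumes that the distance travelled falls off in a straight line with the logarithm of fragment size, a common rule of thumb; the gel length and the 10,000 bp upper limit are made-up values.

```python
import math


def migration_mm(size_bp, gel_length_mm=80.0, largest_bp=10_000):
    """Toy model: distance travelled decreases linearly with log10 of fragment size."""
    fraction = 1 - math.log10(size_bp) / math.log10(largest_bp)
    return gel_length_mm * max(0.0, fraction)


fragments_bp = [1000, 100, 5000, 250]  # hypothetical fragment sizes in base pairs
# Sort from the top of the gel (least distance) to the bottom (most distance).
top_to_bottom = sorted(fragments_bp, key=migration_mm)
print(top_to_bottom)  # [5000, 1000, 250, 100]
```

The largest fragment stays nearest the wells and the smallest ends up closest to the positive electrode.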
Mining DNA for data
Information contained in DNA can reveal blood relations between individuals, disease predispositions, ID in forensics, selection for clinical trials, ethnic migration and more.
Different parts of the genome are used for comparisons at different levels. For example, the sequences that are conserved (stay the same) over hundreds of years in specific ethnic groups are much more generic than those that are different between individuals, even those closely related.
These sequences are often non-coding parts of the genome, except of course in the case of genetic conditions that are caused by specific protein products of (coding) DNA. They can be short or long sequences, and they can repeat themselves many times over.
Single nucleotide polymorphisms (SNPs) are, as the name describes, single-base variations in the genome. Closely related individuals share more SNPs than less related ones. Where SNPs are inherited close together on a chromosome, rather than get interchanged during meiosis, a haplotype emerges. This group of SNPs can also be used to analyse patterns of relatedness between individuals and beyond.
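Spotting single-base differences between two aligned sequences takes only a few lines of Python. The sequences below are invented, and real comparisons involve far longer stretches of DNA.

```python
def snp_positions(seq_a, seq_b):
    """Return the positions (counting from 0) where two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


reference = "ATCGGATTCA"   # hypothetical individual
relative = "ATCGGCTTCA"    # hypothetical close relative
stranger = "ATTGGCTTGA"    # hypothetical unrelated individual

print(snp_positions(reference, relative))  # [5]
print(snp_positions(reference, stranger))  # [2, 5, 8]
```

The relative differs from the reference at a single position, while the stranger differs at three.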
Variable number tandem repeats are sequences that repeat themselves (e.g. ATCCGATATCCGATATCCGAT) varying numbers of times (hence, “variable number”) e.g. ATCCGATATCCGAT is 2 times; ATCCGATATCCGATATCCGAT is 3 times, and so forth.
The number of repeats for these sequences can also be an identifying feature of a genome.
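Counting the repeats can be sketched in Python as follows, using the example unit ATCCGAT from above. The function name is made up, and the flanking bases in the last example are invented.

```python
def count_tandem_repeats(sequence, unit):
    """Length of the longest run of back-to-back copies of unit in sequence."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    best = 0
    for start in range(len(sequence)):
        run, pos = 0, start
        # Keep stepping forward one unit at a time while the unit repeats.
        while sequence.startswith(unit, pos):
            run += 1
            pos += len(unit)
        best = max(best, run)
    return best


print(count_tandem_repeats("ATCCGATATCCGAT", "ATCCGAT"))                # 2
print(count_tandem_repeats("ATCCGATATCCGATATCCGAT", "ATCCGAT"))         # 3
print(count_tandem_repeats("GG" + "ATCCGAT" * 3 + "CC", "ATCCGAT"))     # 3
```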
DNA fingerprinting can also be carried out with a radioactive molecule and developing a film.
1. The sample DNA undergoes PCR then cleavage at multiple sites with restriction endonucleases
2. The resulting many small fragments are tagged using a radioactive molecule
3. They’re separated using gel electrophoresis and viewed using a developed photographic film
The bands exposed then undergo simple visual analysis by matching up the template DNA with other DNA that could be similar more or less, depending on situation. Above, the DNA found at a crime scene is compared with that of 3 suspects. The bands of suspect 2 are perfectly aligned with the crime scene DNA.
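The comparison itself amounts to checking whether two sets of bands coincide, which can be sketched in Python. The band positions below are entirely made up (expressed as fragment sizes) and are only meant to mirror the three-suspect scenario.

```python
# Hypothetical band patterns, given as sets of fragment sizes in base pairs.
crime_scene = {120, 340, 560, 900}
suspects = {
    "suspect 1": {120, 300, 560, 880},
    "suspect 2": {120, 340, 560, 900},
    "suspect 3": {150, 340, 610, 900},
}


def shared_bands(profile_a, profile_b):
    """Number of bands that line up between two profiles."""
    return len(profile_a & profile_b)


for name, bands in suspects.items():
    print(name, shared_bands(crime_scene, bands), "of", len(crime_scene))

perfect_matches = [name for name, bands in suspects.items() if bands == crime_scene]
print(perfect_matches)  # ['suspect 2']
```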
In the case of paternity tests, the child’s DNA fragments do not completely match their father’s, because some will be from the mother. Here, only the remaining fragments (the ones not from the mother) are matched up against potential fathers.
Mary, the mother, and the child share the first fragment; so looking at the remaining fragments of the child, Larry is the father as they share 3/3 fragments. Bob and the child only share 1/3.
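The reasoning can be followed step by step in a short Python sketch. The fragment sizes are invented, chosen only so that the outcome mirrors the example with Mary, Larry and Bob.

```python
# Hypothetical band patterns, given as sets of fragment sizes.
mary = {100, 300, 500, 700}    # mother
child = {100, 250, 400, 650}
larry = {250, 400, 650, 800}   # candidate father
bob = {200, 400, 550, 750}     # candidate father

# Step 1: set aside the child's fragments that come from the mother.
paternal = child - mary
print(sorted(paternal))  # [250, 400, 650]

# Step 2: see how many of the remaining fragments each candidate shares.
candidates = {"Larry": larry, "Bob": bob}
scores = {name: len(paternal & bands) for name, bands in candidates.items()}
print(scores)  # {'Larry': 3, 'Bob': 1}

likely_father = max(scores, key=scores.get)
print(likely_father)  # Larry
```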
Disease markers and drug response
I got some of my DNA screened for several select markers, including for Alzheimer’s disease and Parkinson’s, as well as many inherited conditions. Before I could see the results, which could tell me I am at a higher risk for some of these conditions, I had to read a statement explaining what these results could mean, not just for myself, but for members of my family too. Maybe I didn’t really care at the time whether I would be more likely to get Alzheimer’s in my old age, but suddenly I realised it might be extremely relevant for my mother or grandmother.
Genetic information can affect people’s outlook on health, lifestyle, family connections, reproduction and identity. Personally, I found out I am a carrier of a thrombosis factor associated with a 5 times higher risk of blood clotting. It won’t affect me hugely, but it might affect my genetic children much more if they receive two copies. I also found out I metabolise certain drugs quicker, and others slower. This might be useful in the future if I need to take them. Some are for epilepsy, some for diabetes, and so on.
Ancestry-wise, I expected my mother’s side to be Balkan (Romanian), and my father’s side to be Middle Eastern (Iranian) based on the region assignments at the time, representing population locations as far back as several hundred years. Indeed, I scored 43% Middle Eastern, but only 14% Balkan! Other populations included Ashkenazi Jewish, Italian and East Asian, with most of it being non-specific, vaguely European. I take it in good humour and am very proud of all these findings, but there are people who might have strong reactions to this type of knowledge about their ancestry.
The ethical implications stretch quite far and wide, up and down. The knowledge pertains to trivial matters such as earwax type and caffeine metabolism, but also significant health markers such as those for breast cancer and Alzheimer’s. They pertain to ourselves as individuals, but stretch to our immediate genetic relatives, generations above, generations below and indeed those yet to be born. This is why this information requires careful treatment.
As briefly touched upon in the introduction to this chapter, genomics (the study of genomes) is emerging as a key scientific field in terms of addressing disease and learning more about health. Within healthcare, genomics has the potential, and has already begun, to support risk prediction, prevention, diagnosis, treatment in terms of drug choice and dosage, and prognosis.
Genomic medicine started in the areas of oncology, pharmacology, rare and undiagnosed diseases and infectious disease.
Risk prediction works by studying associations between certain diseases and the presence of specific gene variants preferentially in that patient population. Sometimes, especially for rare diseases that tend to have a single genetic root, it’s possible to know the mechanism by which that mutation causes a disease. However, other times this isn’t elucidated and all we can work with is the knowledge that, for whatever reason as of yet unknown, the association stands. It gives a patient a percentage increased lifetime likelihood of developing a certain disease.
One example is the BRCA1 and BRCA2 genes, which act as tumour suppressor genes: their protein products are involved in DNA repair in cells. Different variants of these genes have been linked to a 20-60% increased risk of breast and ovarian cancer.
Prevention can then take place by paying close attention, just by being aware of the increased risk, or in some cases, preventative interventions such as taking certain drugs or elective surgeries. In pharmacology, knowledge of increased risk of side effects from certain drugs can inform patients to avoid them or take an alternative drug. This ties in with treatment, and a patient’s option to take a drug they will personally have a better response to, or at a better tailored dose. For example, fast metabolism of a drug may mean they will have to take it more frequently as their body is breaking it down more quickly.
Prognosis is about knowing the likely outcome of a condition. This can connect back to the drugs taken and response to those, or refer to how a disease might develop. For example, in the case of some diseases there are multiple variations in genes with different outcomes. This could be in terms of the likelihood of getting a disease, as well as in terms of disease severity and progression.
Gene delivery in eukaryotes
Delivering DNA into cells for various purposes can be achieved via viruses which naturally can infect certain cells, as well as gene guns (biolistics) for plants.
Gene therapy involves inserting a functional gene into a patient who lacks it, or needs supplementary support. This works for conditions which are caused by a single faulty gene rather than multiple genes. The vector used to deliver the gene is a harmless virus. Once the new DNA is taken up in the cell nucleus, the gene is expressed like any other gene.
One problem associated with gene therapy is the immune reaction the body has to the virus. This may cause inflammation and other potentially serious side effects. Another issue is that of maintaining the effects of the healthy gene inserted into target cells. If these are recycled quickly or don’t pass on the new gene to their offspring cells, then the therapeutic effect stops there, and the case is that multiple rounds of gene therapy must be administered.
Conditions for which gene therapy has been used or trialled include cystic fibrosis, haemophilia, Parkinson’s disease, muscular dystrophy and sickle cell anaemia, among many others. One particular gene therapy drug, Glybera, became famous as the most expensive drug in the world at $1.6 million per patient.
This is somatic gene therapy because only targeted cells that do not pass down their DNA through generations are genetically modified. In germ line gene therapy, the germ cells i.e. sperm and eggs are targeted, rendering the modifications open to being inherited freely and indefinitely.
Therefore, germ line gene therapy raises more ethical concerns. The effects of therapy, whether corrective, arbitrary or unpredictable, would be passed down indefinitely to later generations, and essentially be out of the control of those who originally set up the therapy.
Germ line gene therapy is therefore outlawed in many places, due to concerns including unpredictability of the effects, beyond attempting to correct genetic disease; and enabling changes in attitudes towards what genetic features should be considered diseases and which merely different features.
Somatic gene therapy such as that conducted for CF (cystic fibrosis) and SCID (severe combined immunodeficiency disease) does not come with such grand ethical challenges. However, it still raises the usual concerns that any new type of therapy would, in terms of safety and efficacy.
For CF, gene therapy involves administering, via inhalation, a copy of the CFTR gene that functions to alleviate CF symptoms. Clinical trials require CF patients to take part in order to establish safety and effectiveness. It may not necessarily benefit patients, and in the early stages may even harm them. However, this is the only way to test the therapies. The ultimate benefit may be reaped by later generations.
In the case of SCID, patients have a low-functioning or almost absent immune function. Gene therapy aims to introduce genes e.g. adenosine deaminase into white blood cells or as a supplement in the body, in order to establish a functioning immune response. A safety problem arose in one of the trials, as the retrovirus acting as a vehicle inserted the gene close to an oncogene. This caused some of the participants to develop leukaemia.
Later trials adjusted the contents of the virus to decrease the likelihood of this happening again, which was successful.
Delivering DNA to plants via gene guns involves firing tiny metal pellets covered with DNA into plant cells.
Once inside the cell, the DNA delivered will be transcribed and translated by cell machinery. The protein encoded by the DNA depends on the gene included. This varies by application.
Gene technology in research and agriculture
Gene tech is used in research to create specific mouse models. By removing a certain gene, or including another, mice are bred to present particular traits. Researchers can buy diabetic mice, mice that will get heart disease, high blood pressure or muscular dystrophy, hairless mice, albino mice, you name it. If it sounds disgusting and dystopic, it’s because it is, especially as most of this type of research bears no fruit.
Regarding this mouse: “The CIEA NOG mouse® was developed by Mamoru Ito of the Central Institute for Experimental Animals (CIEA) in Japan”. The mouse is branded and marketed like an iPhone.
By using knockout mice, researchers aim to discover the impact of certain genes on disease, behaviour, function, etc. This is achieved by comparing experimental outcomes of control mice with knockout mice.
Super soya beans
The first genetically modified (GM) soya beans were introduced by Monsanto in 1994. They now occupy the majority of soya land.
The applications of this technology focused on increasing the yield of soya beans at as low a cost as possible. In time, it became apparent that various other elements could also be improved, as the use of soya spans many different products. The soya beans could be made healthier and more valuable by adding several genes foreign to them, from bacteria and other plants (Roundup Ready Soybean). These can be delivered using a gene gun.
DuPont Pioneer is a company that developed a GM soya bean that makes the resulting soya oil more valuable. Naturally, soya oil is very susceptible to oxidation, which makes the oil rancid. By silencing or knocking out the delta 9 and delta 12 desaturase enzymes, they made a soya bean with an altered fatty acid composition high in oleic acid and stearic acid, and low in linolenic acid. This different fatty acid profile makes it less susceptible to oxidation.
Although the scientific community has concluded that GM food is equally safe to eat as non-GM food (as tested individually for each new GM food to be marketed), public opinion is still relatively against it. GM food sparked a big debate around its advantages and disadvantages to big agriculture companies, consumers, scientists, farmers and others.
On one hand, GM food improves yield and the overall value extracted from crops. This is significant to the economy as well as areas where people struggle to have enough food to eat. Companies that develop GM food sell seeds and get to make large profits which can be used to further research GM food.
On the other hand, patenting food raises issues for small farmers in developing countries, as well as other ethical concerns of big companies owning the food source of the world. Suspicious consumers also don’t fully trust GM food, so may choose to avoid it in favour of non-GM food.
These concerns that affect multiple parties constantly vie for attention, and form the dynamic of attempting to accomplish balance in this debate.
RNA interference (RNAi)
A major component in the regulation of transcription and translation is RNA interference, notably via microRNA (miRNA) and small interfering RNA (siRNA).
miRNA is a sequence complementary to a portion of transcribed mRNA. Upon binding a protein complex, it attaches to the section of target mRNA, thus blocking translation as well as speeding up the eventual breakdown of the mRNA strand.
As for siRNA, it does what it says, it interferes and it’s small! What does it interfere with? It interferes with translation by binding to mRNA and cleaving it. This prevents it from being translated in the cytoplasm via tRNA and ribosomes to produce a polypeptide. Therefore the specific gene it codes for is not expressed.
siRNA is a short, double-stranded fragment of RNA which binds and cleaves mRNA through RISC, the RNA-induced silencing complex. The Dicer processing enzyme and the RISC protein complex are the same ones involved in the miRNA pathway, because miRNA and siRNA share the same machinery after they’re synthesised.
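The way a guide strand finds and cuts its target can be sketched in Python. This is a heavy simplification: the sequences are invented and much shorter than a real siRNA, only perfect matches are considered, and cutting in the middle of the paired region is an assumption made for illustration.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(rna):
    """Sequence that would base-pair with rna, read in the usual direction."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def find_target(mrna, guide):
    """Index in mrna where the guide strand pairs perfectly, or -1 if nowhere."""
    return mrna.find(reverse_complement_rna(guide))


def cleave(mrna, guide):
    """Cut the mRNA in the middle of the paired region; leave it intact if no match."""
    site = find_target(mrna, guide)
    if site == -1:
        return [mrna]
    cut = site + len(guide) // 2
    return [mrna[:cut], mrna[cut:]]


mrna = "AUGGCUCCAGAAUGGUAA"  # hypothetical target mRNA
guide = "CAUUCUGG"           # hypothetical guide strand, complementary to CCAGAAUG

print(find_target(mrna, guide))  # 6
print(cleave(mrna, guide))       # ['AUGGCUCCAG', 'AAUGGUAA']
print(cleave(mrna, "GGGGGGGG"))  # ['AUGGCUCCAGAAUGGUAA']
```

Once the mRNA is in two pieces it can no longer be translated into the full polypeptide, which is the silencing effect described above.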
RNAi has the potential to be used in various disease treatments. A general issue with RNAi in this regard is off-target effects on genes of similar sequence. In a mouse model of liver disease treated with different RNAi protocols, almost half of the mice died.
Delivery of siRNA to the right cells, tissue or organ is also challenging as the negatively-charged molecules do not readily cross the plasma membrane of cells. Immune reaction is also an issue with RNAi therapy.
Targets that could benefit from refined versions of this therapy include cancer and viral infections like HIV and herpes simplex virus. By silencing the expression of these viral components e.g. antigens and reverse transcriptases, infection could be successfully stopped in its tracks. Alongside delivery of the RNAi therapy, another hurdle is the high mutation rate of the virus which would render the therapy ineffective, as some viruses would escape.
In the case of cancer, another challenge is the identification of good targets for therapy. This could be any gene that contributes to the growth and spread of cancer cells e.g. growth factors and oncogenes; or even generic cell cycle genes that control division e.g. genes involved in the formation of spindle fibres.