What is Bioinformatics?
Eukaryotic Genomic DNA
Expressed Sequence Tag
Basic Local Alignment Search Tool
Gene Comparison Example
What is Bioinformatics?
Bioinformatics is the study of the vast quantities of data generated in biology using computational tools. Biological databases hold anywhere from hundreds of thousands to tens of millions of records in categories such as published research papers, genes, protein information and documented variations, covering many different species and molecules.
By compiling these databases, scientists can easily look up a detail without having to repeat the experiments or checks that were needed to validate the data in the first place. These databases are growing exponentially.
Many analyses can be carried out on this data digitally, without the need for any biological materials or labs. For example, genomic data can be used to study the same gene in different species. The sequences can be compared to see how similar the genes or their protein products are, which helps establish how closely related the species are.
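As a rough illustration of comparing protein products, the sketch below computes the percentage of identical positions between two sequences. It assumes the sequences are already aligned and of equal length, which real data rarely is. The function name `percent_identity` and the three protein fragments are invented for this example and do not come from any database.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length (pre-aligned)")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


# Hypothetical protein fragments, invented for illustration only.
species_a = "MKTAYIAKQR"
species_b = "MKTAYLAKQR"
species_c = "MRSAYLGKHR"

identity_ab = percent_identity(species_a, species_b)
identity_ac = percent_identity(species_a, species_c)
print(f"A vs B: {identity_ab:.0f}% identical")
print(f"A vs C: {identity_ac:.0f}% identical")
```

In this toy data, species B would be read as the closer relative of species A. Real comparisons align the sequences first and use many genes, not one short fragment.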
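Eukaryotic Genomic DNA
Eukaryotic genomic DNA contains introns and other non-coding sequences. cDNA (complementary DNA) is generated by reverse transcription of mature mRNA, from which the introns have already been spliced out. It therefore shows the expressed sequence without the introns, including the coding sequence that specifies the protein of interest.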
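The effect of removing introns can be sketched in a few lines. The toy gene below is invented: exons are written in uppercase and introns in lowercase purely so the result is easy to check by eye, and the exon coordinates are assumed to be known in advance. The sketch ignores untranslated regions, the poly-A tail and strand orientation.

```python
def splice_exons(genomic: str, exons: list[tuple[int, int]]) -> str:
    """Join exon slices (0-based, end-exclusive) in order, dropping introns."""
    return "".join(genomic[start:end] for start, end in exons)


# Hypothetical gene: three exons (uppercase) separated by two introns (lowercase).
genomic = "ATGGCC" + "gtaagtttcag" + "AAATTT" + "gtgagtccag" + "GGGTAA"

# Assumed exon coordinates for the toy gene above.
exons = [(0, 6), (17, 23), (33, 39)]

spliced = splice_exons(genomic, exons)
print(len(genomic), "bases of genomic DNA ->", len(spliced), "bases after splicing")
print(spliced)
```

The spliced sequence is the part a cDNA copy would carry, which is why cDNA is shorter than the gene it comes from.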
Expressed Sequence Tag
An expressed sequence tag (EST) is a short, randomly sequenced part of this cDNA, typically a few hundred bases long. It acts as a tag or identifier for its corresponding gene.
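The idea of a tag identifying its gene can be sketched as a simple lookup. The two-gene library, the gene names and the helper `genes_matching_tag` below are hypothetical. Exact substring matching is used only to keep the example short; real ESTs contain sequencing errors, so in practice they are matched by alignment.

```python
def genes_matching_tag(est: str, library: dict[str, str]) -> list[str]:
    """Return the names of all cDNAs that contain the tag as an exact substring."""
    return [name for name, cdna in library.items() if est in cdna]


# Hypothetical cDNA library, invented for illustration only.
cdna_library = {
    "geneA": "ATGGCCAAATTTGGGTAA",
    "geneB": "ATGCTTGACCGTTCAGGATGA",
}

# Simulate an EST: a short read taken from part of one cDNA.
est = cdna_library["geneB"][5:17]

matches = genes_matching_tag(est, cdna_library)
print(est, "identifies", matches)
```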
Basic Local Alignment Search Tool
BLAST (Basic Local Alignment Search Tool, https://blast.ncbi.nlm.nih.gov/Blast.cgi) can identify similar sequences in different species by aligning them. It can compare nucleotide sequences (blastn) or protein amino acid sequences (blastp).
You enter your sequence, for example AGGTCGATCGATGGCCATTCG, and the results page lists the matching sequences it finds and the species each one comes from.
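To give a feel for what "local alignment" means, here is a minimal sketch that scores the best local match between the example query and a subject sequence. It uses the Smith-Waterman dynamic programming method, which is not how BLAST itself works: BLAST uses a faster heuristic that approximates this kind of result. The scoring values (match +2, mismatch -1, gap -2) and the subject sequence are assumptions chosen for the example.

```python
def local_alignment_score(query: str, subject: str,
                          match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Best Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(subject) + 1)  # scores for the previous query position
    best = 0
    for q in query:
        curr = [0]
        for j, s in enumerate(subject, start=1):
            diagonal = prev[j - 1] + (match if q == s else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            score = max(0, diagonal, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


query = "AGGTCGATCGATGGCCATTCG"

# Hypothetical subject: the query embedded between unrelated flanking bases.
subject = "TTTT" + query + "CCCC"

embedded_score = local_alignment_score(query, subject)
print("score against subject containing the query:", embedded_score)
print("score against an unrelated run of T:", local_alignment_score(query, "TTTTTTTTTT"))
```

Because the alignment is local, the flanking bases in the subject do not lower the score: only the best-matching region counts.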
Normally you would already have a long sequence in a document that you can copy and paste into the query box. BLAST is very useful for comparing newly sequenced DNA to existing sequences in the ever-expanding database. The degree of alignment between sequences can suggest the function of the target gene or protein…