Hexokinase is an Important Enzyme that is Part of the Glycolytic Pathway: Bioinformatics Assignment, UoE, UK
University: University of Edinburgh (UoE)
For all questions, illustrate your answers fully, describing what you did at every step and including the output you obtained.
You need to embed within your submitted work all relevant output from online servers, as appropriate, together with a written commentary that fully illustrates the work you carried out. For all parts, describe how you obtained your data by stating the bioinformatics portal used and the search strategy. Please quote the accession numbers of all database files used.
Q1. Hexokinase is an important enzyme that is part of the glycolytic pathway. There are several forms of the enzyme, one of which is highly expressed in cancer cells.
- Use the NCBI or EBI portals to retrieve one file for each of the several different forms of hexokinase found in humans. Each file should contain the complete mRNA sequence (it is not necessary to print the sequence out).
- Compile a table, similar to the one from tutorial 1 (page 6), comparing the sequence elements of the mRNA of each of the different hexokinase forms that you find. Include extra columns indicating the length of the protein and the receptor sub-type. Comment on your findings.
- Retrieve files for two genes of the hexokinase types you retrieved in part (a) and compare the structure of the genes (exon/intron profile). Comment on your findings.
For all parts, describe how you obtained your data by stating the bioinformatics portal used and the search strategy. Accession numbers of all sequence files must be given. Any references used should be cited in your answer. Expect to retrieve about a dozen files in total.
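As an illustration of how the exon/intron profiles of two genes might be compared once the exon coordinates have been read from the retrieved gene files, the following is a minimal Python sketch. The function names (`intron_profile`, `summarise`) and the coordinates are assumptions made up for this example; they are not real hexokinase annotations and must be replaced with values taken from your own files.

```python
from typing import Dict, List, Tuple

# An exon as 1-based inclusive (start, end) positions on the genomic sequence.
Exon = Tuple[int, int]


def intron_profile(exons: List[Exon]) -> List[int]:
    """Return the length of each intron lying between consecutive exons."""
    ordered = sorted(exons)
    return [
        next_start - prev_end - 1
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


def summarise(name: str, exons: List[Exon]) -> Dict[str, object]:
    """Collect the figures usually compared between two gene structures."""
    exon_lengths = [end - start + 1 for start, end in sorted(exons)]
    introns = intron_profile(exons)
    return {
        "gene": name,
        "exon_count": len(exon_lengths),
        "exon_total_bp": sum(exon_lengths),
        "intron_count": len(introns),
        "intron_total_bp": sum(introns),
    }


# Invented coordinates for illustration only (NOT real hexokinase data).
gene_a = [(1, 200), (501, 650), (1001, 1300)]
gene_b = [(1, 120), (301, 420)]

for label, exons in (("gene_a", gene_a), ("gene_b", gene_b)):
    print(summarise(label, exons))
```

A summary of this kind only supports the comparison; the written comment on the findings still needs to refer to the annotated files and their accession numbers.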
Q2. Cytochrome P450s are a group of heme-thiolate monooxygenases. In liver microsomes, these enzymes are involved in an NADPH-dependent electron transport pathway. They oxidize a variety of structurally unrelated compounds, including steroids, fatty acids, and xenobiotics. In this question, you will explore how the Cytochrome P450 1A1 sequence is related across a number of different species.
- Using UniProt, locate the full-length human sequence. Then run a BLAST search with this sequence at the NCBI site (not from UniProt) against the Swiss-Prot protein database and identify the protein in 7 other species with close similarity to the human sequence (make sure they are full-length sequences).
- Give the Accession number for each protein sequence identified, together with the species. Give the percentage identity for each of the 7 sequences with that of the human sequence. State E values and the length of each sequence.
- For all 8 sequences, run a multiple sequence alignment using the program Clustal Omega and show the alignment generated.
- How many positions along the multiple sequence alignment are fully conserved between species?
- Display both the cladogram and phylogram trees obtained for the aligned sequences. Briefly discuss the evolutionary relationship between the 8 species as indicated by the cladogram and phylogram. Which species is the closest relation to the human species?
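The count of fully conserved positions can be cross-checked programmatically. The sketch below is one possible approach, assuming the aligned sequences have already been copied out of the Clustal Omega output into equal-length strings; the function name `fully_conserved_columns` and the toy alignment are invented for illustration. In Clustal Omega output, the same positions are marked with `*` in the consensus line.

```python
from typing import Dict


def fully_conserved_columns(alignment: Dict[str, str]) -> int:
    """Count columns where every sequence has the same residue and no gap."""
    rows = list(alignment.values())
    if not rows:
        return 0
    if len({len(row) for row in rows}) != 1:
        raise ValueError("aligned sequences must all have the same length")

    count = 0
    for column in zip(*rows):
        residues = set(column)
        # A column is fully conserved only if it holds one residue type
        # and that residue is not the gap character.
        if len(residues) == 1 and "-" not in residues:
            count += 1
    return count


# Toy alignment for illustration only (not real CYP1A1 sequences).
toy = {
    "seq1": "MKT-LLA",
    "seq2": "MKTALLA",
    "seq3": "MRT-LLG",
}

print(fully_conserved_columns(toy))  # 4 (columns 1, 3, 5 and 6)
```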
Q3. Detecting remote homologs with BLAST and PSI-BLAST.
- The NCBI website (http://www.ncbi.nlm.nih.gov) gives the option to run both BLAST and PSI-BLAST for a query protein sequence. For this question, you need to use the NCBI website to run both BLAST and PSI-BLAST.
- The enzyme adenosine deaminase (UniProt accession number P00813) and the enzyme guanine deaminase (UniProt accession number P76641) perform a similar function and are remote homologs, both belonging to the SCOP superfamily Metallo-dependent hydrolase. The two sequences have a percentage identity of only 15%.
- Perform a protein-protein BLAST search using the adenosine deaminase sequence (UniProt accession number P00813), searching against the UniProtKB/Swiss-Prot database. Search the results for the guanine deaminase enzyme (UniProt accession number P76641). Now repeat using PSI-BLAST and compare your results with those obtained from protein-protein BLAST.
- Discuss what you observe from the BLAST and PSI-BLAST searches. Discuss which of the two search methods proved more effective and why. Include output as appropriate to illustrate your answer, including the pairwise alignment for the two sequences generated from your work.
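To make the percentage identity figure concrete, here is a minimal Python sketch that computes it from a gapped pairwise alignment. It assumes the convention of identical positions divided by alignment length (gap columns included), which is how BLAST reports identities; other tools may use a different denominator, so state which one you used. The function name `percent_identity` and the toy sequences are illustrative assumptions, not the real P00813 or P76641 sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical positions as a percentage of the alignment length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # ignore columns that are gaps in both sequences
        columns += 1
        if a == b:
            matches += 1
    return 100.0 * matches / columns if columns else 0.0


# Toy pairwise alignment for illustration only.
query = "MKT-LLA"
subject = "MRTALLG"

print(round(percent_identity(query, subject), 1))  # 57.1 (4 of 7 columns)
```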
Q4. Use the two protein domain databases Pfam and SMART to investigate the domain structure of the human protein Integrin beta-1 (accession code P05556). Discuss the domains located within the sequence by each of the two databases, and compare and contrast the results found. Discuss the functional role of the different domains identified.
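When comparing the two sets of domain predictions, it can help to check which regions reported by one database overlap those reported by the other. The following Python sketch shows one way this might be done, assuming the domain boundaries have been copied by hand from the two result pages. The `Domain` type, the `match_domains` function, the 50% overlap threshold, and all names and coordinates are hypothetical placeholders, not real annotations for P05556.

```python
from typing import List, NamedTuple, Tuple


class Domain(NamedTuple):
    name: str
    start: int  # 1-based, inclusive residue position
    end: int    # 1-based, inclusive residue position


def overlap(a: Domain, b: Domain) -> int:
    """Number of residues shared by two domain intervals."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def match_domains(
    first: List[Domain], second: List[Domain], min_fraction: float = 0.5
) -> List[Tuple[str, str, int]]:
    """Pair domains that share at least min_fraction of the shorter one."""
    pairs = []
    for d1 in first:
        for d2 in second:
            shared = overlap(d1, d2)
            shorter = min(d1.end - d1.start + 1, d2.end - d2.start + 1)
            if shared / shorter >= min_fraction:
                pairs.append((d1.name, d2.name, shared))
    return pairs


# Placeholder entries for illustration only (NOT real Pfam/SMART results).
pfam_hits = [Domain("pfam_dom_1", 10, 100), Domain("pfam_dom_2", 150, 200)]
smart_hits = [Domain("smart_dom_1", 20, 110), Domain("smart_dom_2", 300, 350)]

print(match_domains(pfam_hits, smart_hits))
```

Domains left unpaired by such a check are the ones worth discussing when contrasting the two databases.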