Protein Structure

CHM-530: Protein Structure Prediction
Introduction
In the postgenomic era, much of the heavy lifting has shifted to predicting protein structure from sequence (DNA or the translated amino acid sequence). A wide range of tools is available to the biochemist for this purpose. Although their output is only a prediction, these tools are invaluable for understanding molecular interactions within the cell. In this assignment, you will translate a given DNA sequence and then run the resulting protein sequence through a secondary structure prediction tool. You will also be asked to explain basic aspects of translation and the types of secondary structure that the prediction tool searches for in the amino acid sequence. Use the Lehninger Principles of Biochemistry textbook or online sources to answer the questions.
Procedure
1. Starting with the DNA sequence listed below, go to the ExPASy Translate tool (https://web.expasy.org/translate/).
a. Copy and paste the sequence into the Translate tool. The output shows the different open reading frames.
b. Six frames are shown, giving all possible protein sequences for the supplied DNA sequence. Select the longest continuous red-highlighted sequence (open reading frame) and click the red “M” to access the amino acid sequence. Confirm that it reports 503 amino acids.
c. Copy the single-letter amino acid code, which should start with M, in FASTA format and paste it into the “Protein Structure Report” below. Make sure that only the amino acid sequence of the open reading frame is pasted into this worksheet. Answer the questions associated with Part 1.
2. Use that single-letter amino acid code to run a secondary structure prediction on the SCRATCH Protein Predictor website (http://scratch.proteomics.ics.uci.edu/).
a. Paste in your translated amino acid sequence and select the prediction option SSpro8: Secondary Structure (8 Class). You will be asked about the output of that selection. You may also select any other predictor options for your own curiosity. You will receive an e-mail with your results, which may take up to 30 minutes.
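For students who want to cross-check the web tool's output, the six-frame translation in step 1 can be sketched in a few lines of Python. This is a minimal illustration and not the ExPASy implementation: it assumes the standard genetic code, the function names and frame labels ("+1" to "-3") are hypothetical, and longest_orf simply takes the longest stretch that starts at an M and runs to the next stop codon or to the end of the frame.

```python
from itertools import product

# Standard genetic code, built in T, C, A, G order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Translate three forward and three reverse-complement reading frames."""
    dna = "".join(dna.split()).upper()  # drop line breaks and spaces
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


def longest_orf(protein):
    """Longest stretch that starts with M and contains no stop codon."""
    best = ""
    for segment in protein.split("*"):
        start = segment.find("M")
        if start != -1 and len(segment) - start > len(best):
            best = segment[start:]
    return best


if __name__ == "__main__":
    demo = "CCATGAAATAGCC"  # toy sequence, not the assignment sequence
    for label, protein in six_frames(demo).items():
        print(label, protein, longest_orf(protein))
```

To apply the sketch to the assignment, paste the DNA sequence into a string and compare the longest open reading frame it reports with the one highlighted by the Translate tool.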
The DNA sequence is listed below.
  
TGCTGACCCTATGATGTATCCTATGGTCATTTATTAAGATGTTATCCTAA

AAAGTATATAACGATTTATTATAGTGTGATAGTAATACCAGAACGAGAAA

TTAGAAAATTGTAAAAAAAGAATTTTAAAATATTATGCGGCTACTTTTCC

TACAGTTTCTGCAATTTTTGCTTCTTCTTCAGCAAATGCGCATAGCATTG

CTTCTATGTCGGCATATCCTTCTGCTTTTGCTGTCATTGCTATTTTGTTG

TATGTGGATATGTGCTCCTCTCCCTCTTTAGTAGCGAAATCAGATAGTAA

CTTCTTTACCTTTAATTCGGTGGCTACTTTTCCTACAGTTTCTGCAATTT

TTGCTTCTTCTTCAGCAAATGCGCATAGCATTGCTTCTATGTCGGCATAT

CCTTCTGCTTTTGCTGTCATTGCTATTTTGTTGTATGTGGATATGTGCTC

CTCTCCCTCTTTTATTGAGAATTCTTCTAATATTTTTTCTACTTTACTGG

ATATCATAGGTATTCGTAATGAATAATCGGAAGGCAAATATATAAGCATT

TGTTAAGCTTTTTTAATACTAAATATAATTAGCATTTTTGTATTTCAACA

AAGTTTGAGATTTTTGTATTACGGAACTAAAAATCCTCTAAAAAACTTAA

CTTGTATATAAAATTCTTTCGTATAATTTCTTTGCCTCTTCATACTTCTC

CTTTGATTTGGTTTCTAATTCGTTCTTCCTTTCAGGAAGTTTTTCAGCTA

ATAGTTTTGAGAGCATATAAACTGTAGCTGTAAGCATAAATTTCTCTTTT

AATTGCTCACTTTCCTTTTTATTTAACTCTTCGAACCAATTGGTTATATA

TTCTAACCCTAAATACTTTATCATTTTCTCTAATATTCCCCTAGCATGAC

CCAATTCAACTAAGGCTTTTTCCCTAATTTTTTCAGATTCTTCCTTTTTA

TTAACCTCCTCCAGCTTTTGAGAGGAGAACAATAGTAATAAATGGTCTTC

GGAGTTAGCCATAAAAAGCTCTTTTAATCCTATTTCAGTCTGTGTCCCCT

TCATCACCTTTAAATTGTATTCATAGCTAATATACTCTTGTTAAAATAAT

GATGACTAACTCCAATACTGACCAATGATGTCGTAACCCGAAACTGAATA

AAAGTAAAATCCTTCCCTACTGAGAATATTTGTATGATAACCTCAAAAAG

AATGAAAGCCCTTGAAATTAATAGCGAAGCATTAGGCGTGCCAACATTAC

TCTTGATGGAAAACGCAGGGAGAAGTGTAAAGGATGAAATAATGAAAAGA

CTGAATTTGGACTATTCTAAAAAGGTTGTAGTATTTGCAGGAACTGGTGG

AAAAGGAGGAGACGGATTAGTAGTAGCAAGGCACCTTGCCTCGGAAGGGT

CAGAGGTTCATGTTTTACTTTTAGGCGAGAACAAACATCCGGACGCAATC

ATTAACTTGAATGCAATATATGAAATGGATTATTCTATTAGAGAAGTTAA

ACTGATAAAAGATACTGACGAATTGCAACCAGTTAAAGCTGACGTGCTTA

TAGATGCCATGTTAGGCACGGGATTTTCTGGTAAAGTTAGAGAACCATTT

AGAACAGCTATTAGAGTATTTAATCAGAGCTCTGGTTTTAAGGTTTCTAT

AGATATACCCTCTGGGATAAATGCAGACGATGAAGAACAGCAGGGAGAAC

ACGTTATTCCCGACCTAATAGTCACCTTTCATGATCTTAAGCCAGGCTTA

AAAAAATTTGAGAGTAAAGTGGTCGTCAAGAAAATAGGTATTCCTAAAGA

GGCTGAAATATATGTTGGTCCCGGTGATGTCATTGTCAATGTGAAGAAAA

GAGAGTATAACACAAAGAAAGGAGATAATGGAAGAGTTTTGATCATTGGA

GGGAATTTTACATTTAGTGGAGCCCCAACTCTATCTGCTTTGGGAGCCTT

AAGGACGGGAGCAGATCTGGTATATGTCGCATCTCCAGAGGAGACAGCTA

AGGTCATCTCTAGCTTTTCCCCTGACCTTATATCTATTAAGCTTAAGGGA

AAGAATATATCTACAGACAATTTGGATGAGCTAAAACCATGGATTGATAA

AGCTGACGTCGTAGTTGTAGGACCTGGTATGGGACAAGAAAGGGAAACTG

TAGATGCTTCCATAGAGATAGTTAGATATCTGAAAGCAAAGAATAAACCT

TCAGTCATAGATGCTGATGCGTTAAAATCAGTGGCAGGTATGGAATTATT

CCCGAATGCAGTAATAACTCCTCATGCAGGAGAATTTAAGATATATTCAG

GGGTTCAGCCTGATTCGAACATGAGAAAAAGAATTGAGCAAGTGAAGGAG

TGCTCACTGAAATGTAATTGTGTAGTACTCCTTAAGGGTTATGTTGATAT

CATAGCAGAAAAGGAAGAATTTAAACTTAATAAGACAGGAAATCCTGGAA

TGGCAGTTGGCGGTACTGGGGATACATTGACAGGAATAATTGCCTCATTT

ATGGCTCAAAAACTATCTCCATTCACTTCTGCTTACTTGGGAGCATTCGT

TAATGGTTTAGCAGGGTCTATAGCATATGAAAAACTTGGCGCACATCTAG

TTGCAACAGATATAATAGAAAACATTCCTAAGGTAATTAATGAACCTTTA

GAAGTGTTCAAGAAAAAAGTGTACAAAAGGATTTTAGATACTTAGGTTTT

ACCCCTAATTCTTTTAATAATCTCAAGTGATTTGTTTGCATGTTCTTCTG

CATTTCCTAGACCGCTCAATACCTCTATAATTTTTCCGTTTTCGTCTATG

ATAAAGGTTACTCTCTGAGCACTTGAGCCTTTCTCGTTTAGAACACCGTA

TAATTTAGCTATTTGTTTATTTGAGTCAGAAACTATAGGAAATCTGGCAC

CGCATTTGTCTGCAAAACTCTTTTGAGTTGAAACTGTATCAACACTAACA

CCTATAACTTCAGCATTTAACTGTTTAAATTGGTCATAAAGTTGTCCAAA

TTTTATGGTCTCTCTAGTACAACCAGGTGTAAACGCCTTAGGATAGAAAT

ATAGTACAACTACAGATTTGCCTCTATATGAAGATAGTTTCAATTTTCCT

ATAGTTGAATCTCCTTCAAAATCAGGAGCTTCATTTCCTTTTTCTAAAGC

CATAGATTATCTGATATAAATATATTCAGTTATGGTTTTTAACCTCTTTT

TCGCTTATGCCTTACA
CHM-530: Protein Structure Report
Name_____________________________
Part 1: Translation
Paste the single-letter amino acid code in the box below. (10 points)

Questions
1. Why are there six different frames from a single DNA sequence? (5 points)
2. Why does each red-highlighted region begin with “M”? (5 points)
Part 2: Secondary Structure Prediction
Paste the predicted secondary structure in the box below. Do not paste the amino acid code. (10 points)

1. What do the following secondary structure designations mean? What specifically is the difference between E and B? Use “SCRATCH; A Quick Description” (http://scratch.proteomics.ics.uci.edu/explanation.html) as a reference. (7.5 points)
H: 
G: 
I: 
E: 
B: 
T: 
S:
C: 
2. Based on the results e-mailed from SCRATCH, what types and amounts of secondary structure did the prediction tool suggest? Specifically, explain how much alpha helix and beta sheet (bridge) was predicted. (7.5 points)
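
The amounts asked for in question 2 can be tallied by counting letters in the predicted secondary structure string. The following is a minimal sketch, assuming the SSpro8 prediction has been copied out of the e-mail as a plain string of the eight class letters listed in question 1; the function name summarize_ss and the example string are hypothetical and do not come from SCRATCH.

```python
from collections import Counter

# The eight SSpro8 class letters listed in question 1.
CLASSES = "HGIEBTSC"


def summarize_ss(prediction):
    """Return {class letter: (residue count, percent of sequence)}."""
    cleaned = "".join(prediction.split()).upper()  # drop whitespace and line breaks
    counts = Counter(cleaned)
    total = len(cleaned)
    summary = {}
    for code in CLASSES:
        percent = 100 * counts[code] / total if total else 0.0
        summary[code] = (counts[code], percent)
    return summary


if __name__ == "__main__":
    example = "CCHHHHHHTTEEEEECC"  # made-up string, not a real prediction
    for code, (count, percent) in summarize_ss(example).items():
        print(f"{code}: {count} residues ({percent:.1f}%)")
```

Reporting both the residue count and the percentage makes it easy to state how much of the 503-residue sequence falls into each class.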