Assignment 4
Please note that due dates can be found in the Syllabus; submission instructions can be found on the Assignment Instructions page. For this assignment, you can submit your answers as a Google Doc (or in another format, such as a text file or pictures), together with your code, via Google Drive. Aki will go over the submission process in the lab.
You might consider (but it is not mandatory) using R Markdown to write your answers.
50 total marks.
Question 1 [points 10] Using the S. cerevisiae (Baker’s yeast) data that we imported into R in Lectures 11 and 12, show R code of how you would estimate the frequency of A, C, G, T nucleotides in coding regions only. Use only chromosome 1.
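As a hint on the mechanics only, here is a minimal sketch of counting nucleotide frequencies inside annotated positions. It uses a made-up 16-base sequence, not the yeast data from the lectures, and the names `seq_chr`, `is_coding` and `nuc_freq` are hypothetical. It assumes the annotation has already been turned into one TRUE/FALSE flag per base.

```r
# Toy data only: this is NOT the yeast chromosome from the lectures.
seq_chr <- strsplit("ATGGCGTAAACCGGTT", "")[[1]]

# Hypothetical annotation: TRUE where the base lies in a coding region.
is_coding <- rep(c(TRUE, FALSE), times = c(9, 7))

# Relative frequency of each nucleotide in a character vector of bases.
# factor() with fixed levels keeps a zero count for any absent base.
nuc_freq <- function(bases) {
  counts <- table(factor(bases, levels = c("A", "C", "G", "T")))
  counts / sum(counts)
}

coding_freq <- nuc_freq(seq_chr[is_coding])
print(coding_freq)
```

The same helper can be applied to `seq_chr[!is_coding]` for the complementary positions.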
Question 2 [points 10] Using the S. cerevisiae (Baker’s yeast) data that we imported into R in Lectures 11 and 12, show R code of how you would estimate the frequency of A, C, G, T nucleotides in non-coding regions only. Use only chromosome 1. Also estimate the self-transition probabilities (coding to coding, non-coding to non-coding) and the transition probabilities between coding and non-coding (coding to non-coding, non-coding to coding).
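One possible way to estimate transition probabilities is to count consecutive pairs of state labels and normalise each row. The sketch below does this on a made-up label vector of length 15; `state` and the other object names are hypothetical, and the real labels would come from the chromosome 1 annotation.

```r
# Toy per-base labels (hypothetical): 4 non-coding, 6 coding, 5 non-coding.
state <- rep(c("noncoding", "coding", "noncoding"), times = c(4, 6, 5))
lev <- c("coding", "noncoding")

# Pair each position with the next one.
from <- factor(state[-length(state)], levels = lev)
to   <- factor(state[-1], levels = lev)

# Rows are the current state, columns are the next state.
trans_counts <- table(from, to)

# Divide each row by its total so every row sums to 1.
trans_prob <- prop.table(trans_counts, margin = 1)
print(trans_prob)
```

The diagonal entries are the self-transition probabilities; the off-diagonal entries are the probabilities of switching state.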
Question 3 [points 20] Using the package in R (available in {), implement your model. The documentation for this package is here. Note that you might want to look at the function as an example. Perhaps follow the function and the example there. Show your code. Apply it back to chromosome 1. Apply it to chromosome 2 too.
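To illustrate what decoding with a two-state model does, here is a small base-R sketch of Viterbi decoding. It deliberately does not use the package referred to above, so it is not a substitute for the requested implementation. The function `viterbi_decode` and all probabilities are made up for illustration, and the model is assumed to be a hidden Markov model with states "coding" and "noncoding".

```r
# Most likely state path for an observed sequence, computed in log space.
# start_p: named start probabilities; trans_p: rows = from, cols = to;
# emit_p: rows = states, cols = symbols. All values here are hypothetical.
viterbi_decode <- function(obs, states, start_p, trans_p, emit_p) {
  n <- length(obs)
  k <- length(states)
  log_delta <- matrix(-Inf, nrow = k, ncol = n)
  back <- matrix(NA_integer_, nrow = k, ncol = n)
  log_delta[, 1] <- log(start_p) + log(emit_p[, obs[1]])
  for (t in seq_len(n)[-1]) {
    for (j in seq_len(k)) {
      cand <- log_delta[, t - 1] + log(trans_p[, j])
      back[j, t] <- which.max(cand)
      log_delta[j, t] <- max(cand) + log(emit_p[j, obs[t]])
    }
  }
  # Trace back from the best final state.
  path <- integer(n)
  path[n] <- which.max(log_delta[, n])
  for (t in rev(seq_len(n - 1))) {
    path[t] <- back[path[t + 1], t + 1]
  }
  states[path]
}

states <- c("coding", "noncoding")
start_p <- c(coding = 0.5, noncoding = 0.5)
trans_p <- matrix(c(0.9, 0.1,
                    0.1, 0.9), nrow = 2, byrow = TRUE,
                  dimnames = list(states, states))
emit_p <- matrix(c(0.1, 0.4, 0.4, 0.1,
                   0.4, 0.1, 0.1, 0.4), nrow = 2, byrow = TRUE,
                 dimnames = list(states, c("A", "C", "G", "T")))

obs <- strsplit("GCGCGCATATAT", "")[[1]]
decoded <- viterbi_decode(obs, states, start_p, trans_p, emit_p)
print(decoded)
```

In this toy model the GC-rich half of the sequence is decoded as coding and the AT-rich half as non-coding; real estimates from Questions 1 and 2 would replace the made-up probabilities.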
Question 4 [points 10] Compute the specificity, sensitivity and accuracy on both chromosomes individually. Comment on your findings.
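As a reminder of the definitions, the sketch below computes the three measures from a pair of made-up logical vectors, treating "coding" as the positive class (an assumption; the opposite convention swaps sensitivity and specificity). The names `truth` and `pred` are hypothetical.

```r
# Toy labels (hypothetical): TRUE = coding, FALSE = non-coding.
truth <- c(TRUE, TRUE, TRUE,  TRUE, FALSE, FALSE, FALSE, TRUE)
pred  <- c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE,  FALSE, TRUE)

tp <- sum(pred & truth)    # coding predicted as coding
tn <- sum(!pred & !truth)  # non-coding predicted as non-coding
fp <- sum(pred & !truth)   # non-coding predicted as coding
fn <- sum(!pred & truth)   # coding predicted as non-coding

sensitivity <- tp / (tp + fn)          # fraction of true coding bases recovered
specificity <- tn / (tn + fp)          # fraction of true non-coding bases recovered
accuracy    <- (tp + tn) / length(truth)

print(c(sensitivity = sensitivity, specificity = specificity, accuracy = accuracy))
```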
Good luck!