Two (short) DNA sequences are compared to judge the degree of relatedness of the organisms from which they were obtained. We will consider the random variable X = number of matching bases in corresponding positions on both sequences. For simplicity, we will ignore the possibility of insertions or deletions for this problem.
The sequences to be compared are:
A T T G C T C T A T T G T G G A C T A C
A T T G C T G T A C T G A G G A C T A C
(a) Suppose that the two pieces of DNA are random and unrelated. Then what is the distribution of the random variable X (state the name of the distribution and list its parameter(s))? On the other hand, if the sequences were related, then what would be a reasonable (qualitative, not quantitative) assumption to be made about the distribution parameter(s)?
(b) Formulate a null hypothesis and alternative (in terms of the distribution parameter(s) from part (a)). Our goal is to decide whether or not the two sequences are related.
Thanks.
Probability question, need help?
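If X is the number of matching bases, then under the "random and unrelated" assumption X would be distributed Binomial(20, 0.25).
n = 20 because that's the length of each sequence.
p = 0.25 is a bit trickier. First, I'm assuming that each of the four bases has equal probability of occurring at every position, independently in the two sequences, so a given position matches with probability 4 × (1/4) × (1/4) = 1/4. Second, each pair is either a match (success) or not.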
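To make the setup concrete, here is a minimal Python sketch that counts the matches in the two sequences from the question and evaluates the Binomial(20, 0.25) model. The function names `count_matches` and `binom_pmf` are made up for this illustration, and the equal-frequency, independent-bases assumption is the same one stated above.

```python
from math import comb

# Sequences from the question, written without spaces.
seq1 = "ATTGCTCTATTGTGGACTAC"
seq2 = "ATTGCTGTACTGAGGACTAC"


def count_matches(a, b):
    """Count positions where both sequences carry the same base (no indels)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b))


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n = len(seq1)                      # number of compared positions
p0 = 0.25                          # match probability for unrelated sequences
x_obs = count_matches(seq1, seq2)  # observed value of X
expected = n * p0                  # mean of Binomial(n, p0)

print(f"n = {n}, observed matches = {x_obs}, expected under p = 0.25: {expected}")
print(f"P(X = {x_obs}) under Binomial({n}, {p0}) = {binom_pmf(x_obs, n, p0):.3e}")
```

The two sequences agree in 17 of the 20 positions, while the unrelated model expects only 20 × 0.25 = 5 matches on average.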
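H0: p = 0.25
Ha: p ≠ 0.25 (two-tailed). Since related sequences should match more often than chance, a one-sided alternative, Ha: p > 0.25, would arguably fit the question better.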
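As a rough sketch of how the test could be carried out on these two sequences, the Python below computes an exact binomial tail probability under H0: p = 0.25. It uses the upper tail only, treating a large number of matches as the evidence for relatedness; that one-sided choice is an assumption of this sketch, not something the answer specifies. The helper name `upper_tail` is hypothetical.

```python
from math import comb

# Sequences from the question, written without spaces.
seq1 = "ATTGCTCTATTGTGGACTAC"
seq2 = "ATTGCTGTACTGAGGACTAC"


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly over the tail."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


n = len(seq1)
x_obs = sum(a == b for a, b in zip(seq1, seq2))  # observed matches
p_value = upper_tail(x_obs, n, 0.25)             # one-sided p-value under H0

print(f"observed matches: {x_obs} of {n}")
print(f"P(X >= {x_obs} | p = 0.25) = {p_value:.3e}")
```

With 17 matches out of 20, the tail probability is roughly 3 × 10^-8, so under this model H0 would be rejected at any conventional significance level.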