I am trying to derive the average case running time for deterministic linear search algorithm. The algorithm searches an element x in an unsorted array A in the order A[1], A[2], A[3]...A[n]. It stops when it finds the element x or proceeds until it reaches the end of the array. I searched on wikipedia and the answer given was (n+1)/(k+1) where k is the number of times x is present in the array. I approached in another way and am getting a different answer. Can anyone please give me the correct proof and also let me know whats wrong with my method?
E(T)= 1*P(1) + 2*P(2) + 3*P(3) ....+ n*P(n) where P(i) is the probability that
the algorithm runs for 'i' time (i.e. compares 'i' elements).
P(i)= (n-i)C(k-1) * (n-k)! / n!
Here, (n-i)C(k-1) is (n-i) Choose (k-1). As the algorithm has reached the ith
step, the rest of k-1 x's must be in the last n-i elements. Hence (n-i)C(k-i).
(n-k)! is the total number of ways of arranging the rest non x numbers, and n!
is the total number of ways of arranging the n elements in the array.
I am not getting (n+1)/(k+1) on simplifying.
You've forgotten to account for the permutations of the k
copies of x
. The correct definition of P(i)
is
P(i) = (n-i)C(k-1) * k! * (n-k)! / n! = (n-i)C(k-1) / nCk.
^^
I'll turn things over to Mathematica:
In[1]:= FullSimplify[Sum[i Binomial[n-i, k-1]/Binomial[n, k], {i, 1, n}], 0 <= k <= n]
1 + n
Out[1]= -----
1 + k
To elaborate on my comment below: assume that all elements are distinct, let X be the set of matches, and let Y be the set of non-matches. By assumption, |X| = k and |Y| = n-k. The expected number of reads is equal to the sum over elements e of the probability that e is read.
Given a set of elements S, each element in S comes before all of the others with probability 1/|S|.
An element x in X is read if and only if it comes before every other element of X, which is probability 1/k. The total contribution of elements in X is |X| (1/k) = 1.
An element y in Y is read if and only if it comes before every element of X, which is probability 1/(k+1). The total contribution of elements in Y is |Y| (1/(k+1)) = (n-k)/(k+1).
We have 1 + (n-k)/(k+1) = (n+1)/(k+1).