I'm reading CLRS and had some problem with the exercise 6.5-8.
Give an O(n lg k)-time algorithm to merge k sorted lists into one sorted list, where n is the total number of elements in all the inputs lists. (Hint: use a min0heap for k-way merging.)
The solution is, as everybody says,
1) build a k-element min-heap using the first element of the k lists,
2) extract-min() to get the smallest element from the heap and append it to the result list,
3) pick the next element from the same list as the one we just extracted from the heap. Insert it into the heap and goto 2).
I can understand the time is O(n lg k), but I don't get the correctness of the choice in step 3). It seems obvious but is there some formal proof?
The main thrust of the correctness proof is that the least element remaining is always the one to be extracted. The key invariant in support of this claim is that the priority queue contains, for each list, the least element remaining from that list. From this invariant, it follows that, since the least element remaining is also the least element remaining from its list, it is returned by extract-min.
We need to choose an element from the same list in part 3 to maintain the invariant that each list have its least element in the queue. Otherwise, we could have a situation like
1 2
3 4
where if we pull 1 from the initial queue containing 1 and 3 and replace it with 4, the next extraction will be 3, which is wrong.