When is merge sort preferred over quick sort?

Mohamed Taher Alrefaie · Mar 23, 2015 · Viewed 24.7k times

Quick sort is much better than merge sort in many cases. But in which cases might merge sort be a better solution than quick sort?

For example, merge sort works better than quick sort when the data cannot be loaded into memory all at once. Are there any other cases?

EDIT: The answers to the suggested duplicate question list all the advantages of quick sort over merge sort. I'm asking here about the possible cases and applications in which using merge sort would be more advantageous than using quick sort.

Answer

templatetypedef · Mar 23, 2015

I should probably start off by mentioning that both quicksort and mergesort can work just fine if you can't fit everything into memory at once. You can implement quicksort by choosing a pivot, then streaming elements in from disk into memory and writing each element into one of two different files based on how it compares to the pivot. If you use a double-ended priority queue, you can actually do this even more efficiently by fitting the maximum possible number of elements into memory at once.
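To make the streaming idea a bit more concrete, here's a rough sketch of a single partitioning pass. The function name `partition_stream` and the one-integer-per-line format are assumptions I'm making for illustration, and this is the simple version, not the double-ended priority queue variant:

```python
import io


def partition_stream(src, pivot, low_out, high_out):
    """Read integers (one per line) from src and route each to one of two outputs.

    Only a single element is held in memory at a time, so the input can be
    far larger than RAM. Returns the number of elements written to each side.
    """
    low_count = high_count = 0
    for line in src:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        value = int(line)
        if value < pivot:
            low_out.write(f"{value}\n")
            low_count += 1
        else:
            high_out.write(f"{value}\n")
            high_count += 1
    return low_count, high_count


# Usage with in-memory streams standing in for files on disk.
src = io.StringIO("5\n1\n9\n3\n7\n")
low, high = io.StringIO(), io.StringIO()
counts = partition_stream(src, 5, low, high)
```

A full external quicksort would repeat this pass on each output file until a piece is small enough to sort in memory. It would also need to handle elements equal to the pivot separately, which this sketch doesn't do.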

Others have mentioned the benefit that mergesort is worst-case O(n log n), which is definitely true. That said, you can easily modify quicksort to produce the introsort algorithm, a hybrid between quicksort, insertion sort, and heapsort, that's worst-case O(n log n) but retains the speed of quicksort in most cases.
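Here's a minimal sketch of how that hybrid fits together. The cutoff of 16 elements and the depth limit of 2·log2(n) are common choices that I'm assuming here, not fixed parts of the algorithm, and the heapsort fallback copies the range for brevity where a production version would work in place:

```python
import heapq
from math import log2

SMALL = 16  # assumed cutoff below which insertion sort takes over


def introsort(a):
    """Sort list a in place with a quicksort that falls back to heapsort."""
    if len(a) > 1:
        _introsort(a, 0, len(a), 2 * int(log2(len(a))))


def _introsort(a, lo, hi, depth):
    while hi - lo > SMALL:
        if depth == 0:
            _heapsort(a, lo, hi)  # too many bad pivots: switch algorithms
            return
        depth -= 1
        p = _partition(a, lo, hi)
        _introsort(a, p + 1, hi, depth)  # recurse on the right part
        hi = p                           # loop on the left part
    _insertion_sort(a, lo, hi)


def _partition(a, lo, hi):
    """Lomuto partition of a[lo:hi] around the middle element."""
    mid = (lo + hi) // 2
    a[mid], a[hi - 1] = a[hi - 1], a[mid]
    pivot = a[hi - 1]
    store = lo
    for i in range(lo, hi - 1):
        if a[i] < pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[hi - 1] = a[hi - 1], a[store]
    return store


def _heapsort(a, lo, hi):
    """Heap-based sort of a[lo:hi]; copies the range to keep the sketch short."""
    heap = a[lo:hi]
    heapq.heapify(heap)
    for i in range(lo, hi):
        a[i] = heapq.heappop(heap)


def _insertion_sort(a, lo, hi):
    for i in range(lo + 1, hi):
        x = a[i]
        j = i - 1
        while j >= lo and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x
```

The depth counter is what gives the worst-case guarantee: once quicksort has recursed deeper than a well-behaved run would, the remaining range is handed to heapsort, which is O(n log n) regardless of input.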

It might be helpful to see why quicksort is usually faster than mergesort, since if you understand the reasons you can pretty quickly find some cases where mergesort is a clear winner. Quicksort usually is better than mergesort for two reasons:

  1. Quicksort has better locality of reference than mergesort, which means that the memory accesses performed in quicksort are usually faster than the corresponding accesses in mergesort.

  2. Quicksort uses worst-case O(log n) memory (if implemented correctly), while mergesort requires O(n) memory due to the overhead of merging.
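The "if implemented correctly" part of point (2) usually means recursing only on the smaller partition and looping on the larger one, so the stack never gets deeper than about log2(n) frames. Here's a sketch of that idea. I've deliberately used the last element as the pivot (a poor choice, assumed only to keep the code short) to show that the stack bound holds even when the pivots are bad:

```python
def partition(a, lo, hi):
    """Lomuto partition of a[lo:hi] using the last element as the pivot."""
    pivot = a[hi - 1]
    store = lo
    for i in range(lo, hi - 1):
        if a[i] < pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[hi - 1] = a[hi - 1], a[store]
    return store


def quicksort(a, lo=0, hi=None):
    """Sort a[lo:hi] in place using O(log n) stack space in the worst case."""
    if hi is None:
        hi = len(a)
    while hi - lo > 1:
        p = partition(a, lo, hi)
        if p - lo < hi - (p + 1):
            quicksort(a, lo, p)      # left part is smaller: recurse on it
            lo = p + 1               # then loop on the larger right part
        else:
            quicksort(a, p + 1, hi)  # right part is smaller: recurse on it
            hi = p                   # then loop on the larger left part
```

Each recursive call handles at most half of the current range, which is where the logarithmic bound comes from. The running time can still degrade to O(n^2) with bad pivots; only the memory use is protected.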

There's one scenario, though, where these advantages disappear. Suppose you want to sort a linked list of elements. The linked list elements are scattered throughout memory, so advantage (1) disappears (there's no locality of reference). Second, linked lists can be merged with only O(1) space overhead instead of O(n) space overhead, so advantage (2) disappears. Consequently, you usually will find that mergesort is a superior algorithm for sorting linked lists, since it makes fewer total comparisons and isn't susceptible to a poor pivot choice.
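As an illustration, here's a minimal sketch of mergesort on a singly linked list. The `Node` class and the helper names are my own, assumed for the example. The merge step just relinks existing nodes, so it needs only O(1) extra space; this top-down version still uses O(log n) stack for the recursion, which a bottom-up variant would avoid:

```python
class Node:
    """A singly linked list node (hypothetical minimal definition)."""

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def merge(a, b):
    """Merge two sorted lists by relinking nodes; no new list nodes are built."""
    dummy = tail = Node(None)  # one temporary node, independent of list size
    while a is not None and b is not None:
        if b.value < a.value:
            tail.next, b = b, b.next
        else:
            tail.next, a = a, a.next  # take from a on ties to stay stable
        tail = tail.next
    tail.next = a if a is not None else b
    return dummy.next


def merge_sort(head):
    """Sort a linked list and return the new head."""
    if head is None or head.next is None:
        return head
    # Find the midpoint with slow/fast pointers, then cut the list in two.
    slow, fast = head, head.next
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
    second = slow.next
    slow.next = None
    return merge(merge_sort(head), merge_sort(second))


def from_list(values):
    """Build a linked list from a Python list (helper for trying it out)."""
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head


def to_list(head):
    """Collect linked list values into a Python list."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out
```

For example, `to_list(merge_sort(from_list([4, 1, 3, 9, 2])))` gives `[1, 2, 3, 4, 9]`. Notice there's no pivot anywhere, so no input ordering can push this past O(n log n).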

Hope this helps!