In Big-O notation for tree structures: Why do some sources refer to O(logN) and some to O(h)?

Stephen Gross · Feb 3, 2012 · Viewed 10.3k times · Source

In researching complexity for any algorithm that traverses a binary search tree, I see two different ways to express the same thing:

Version #1: The traversal algorithm makes at most one comparison per level of the tree; therefore the complexity is O(h).

Version #2: The traversal algorithm makes at most one comparison per level of the tree; therefore the complexity is O(logN).

It seems to me that the same logic is at work, yet different authors use either log N or h. Can someone explain why?

Answer

templatetypedef · Feb 3, 2012

The correct bound on the worst-case time to search a tree is O(h), where h is the height of the tree. If you are using a balanced search tree (one where the height of the tree is O(log n)), then the worst-case lookup time is O(log n). That said, not all trees are balanced. For example, here's a tree with height n - 1:

1
 \
  2
   \
    3
     \
      ...
       \
        n

Here, h = O(n), so the lookup is O(n). It's correct to say that the lookup time is also O(h), but h ≠ O(log n) in this case and it would be erroneous to claim that the lookup time was O(log n).
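You can produce exactly this degenerate shape by inserting keys in sorted order into a plain (unbalanced) BST. A minimal sketch, with hypothetical helper names not taken from the answer:

```python
# Plain BST with no rebalancing: sorted insertions degenerate into the
# "linked list" shape shown above, so a lookup can cost n comparisons.

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Standard unbalanced BST insertion."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def height(root):
    """Height = longest root-to-leaf edge count; empty tree is -1."""
    if root is None:
        return -1
    return 1 + max(height(root.left), height(root.right))

def comparisons_to_find(root, key):
    """Count how many nodes a lookup visits."""
    count = 0
    while root is not None:
        count += 1
        if key == root.key:
            return count
        root = root.left if key < root.key else root.right
    return count

n = 100
root = None
for k in range(1, n + 1):  # sorted order: every node goes to the right
    root = insert(root, k)

print(height(root))                   # n - 1 = 99
print(comparisons_to_find(root, n))   # n = 100 comparisons: O(n), not O(log n)
```

Here h = n - 1, so the O(h) bound is tight and the lookup really does take linear time.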

In short, O(h) is the correct bound in general. O(log n) is also correct for a balanced search tree, since there h = O(log n); but an imbalanced tree can have lookup time much worse than O(log n).
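To see the balanced case concretely, here is a sketch (helper names are my own) that builds a balanced BST from the same sorted keys by always rooting at the median, so the height comes out logarithmic:

```python
# Building a BST from sorted keys by recursing on the median keeps the
# two subtrees nearly equal in size, giving height floor(log2 n).

class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def build_balanced(keys):
    """Root at the median of a sorted list; recurse on each half."""
    if not keys:
        return None
    mid = len(keys) // 2
    return Node(keys[mid],
                build_balanced(keys[:mid]),
                build_balanced(keys[mid + 1:]))

def height(root):
    return -1 if root is None else 1 + max(height(root.left),
                                           height(root.right))

root = build_balanced(list(range(1, 101)))
print(height(root))  # 6 = floor(log2(100)): lookups here cost O(log n)
```

Self-balancing trees (AVL, red-black, etc.) maintain this O(log n) height automatically under insertions and deletions, which is what justifies the O(log n) figure in sources that assume a balanced tree.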

Hope this helps!