Find kth smallest element in a binary search tree in Optimum way

bragboy picture bragboy · Feb 24, 2010 · Viewed 124.6k times · Source

I need to find the kth smallest element in the binary search tree without using any static/global variable. How to achieve it efficiently? The solution that I have in my mind is doing the operation in O(n), the worst case since I am planning to do an inorder traversal of the entire tree. But deep down I feel that I am not using the BST property here. Is my assumptive solution correct or is there a better one available ?

Answer

IVlad picture IVlad · Feb 24, 2010

Here's just an outline of the idea:

In a BST, the left subtree of node T contains only elements smaller than the value stored in T. If k is smaller than the number of elements in the left subtree, the kth smallest element must belong to the left subtree. Otherwise, if k is larger, then the kth smallest element is in the right subtree.

We can augment the BST to have each node in it store the number of elements in its left subtree (assume that the left subtree of a given node includes that node). With this piece of information, it is simple to traverse the tree by repeatedly asking for the number of elements in the left subtree, to decide whether to do recurse into the left or right subtree.

Now, suppose we are at node T:

  1. If k == num_elements(left subtree of T), then the answer we're looking for is the value in node T.
  2. If k > num_elements(left subtree of T), then obviously we can ignore the left subtree, because those elements will also be smaller than the kth smallest. So, we reduce the problem to finding the k - num_elements(left subtree of T) smallest element of the right subtree.
  3. If k < num_elements(left subtree of T), then the kth smallest is somewhere in the left subtree, so we reduce the problem to finding the kth smallest element in the left subtree.

Complexity analysis:

This takes O(depth of node) time, which is O(log n) in the worst case on a balanced BST, or O(log n) on average for a random BST.

A BST requires O(n) storage, and it takes another O(n) to store the information about the number of elements. All BST operations take O(depth of node) time, and it takes O(depth of node) extra time to maintain the "number of elements" information for insertion, deletion or rotation of nodes. Therefore, storing information about the number of elements in the left subtree keeps the space and time complexity of a BST.