Are duplicate keys allowed in the definition of binary search trees?

Tim Merrifield picture Tim Merrifield · Nov 19, 2008 · Viewed 108.1k times · Source

I'm trying to find the definition of a binary search tree and I keep finding different definitions everywhere.

Some say that for any given subtree the left child key is less than or equal to the root.

Some say that for any given subtree the right child key is greater than or equal to the root.

And my old college data structures book says "every element has a key and no two elements have the same key."

Is there a universal definition of a bst? Particularly in regards to what to do with trees with multiple instances of the same key.

EDIT: Maybe I was unclear, the definitions I'm seeing are

1) left <= root < right

2) left < root <= right

3) left < root < right, such that no duplicate keys exist.

Answer

Chris picture Chris · Nov 19, 2008

Many algorithms will specify that duplicates are excluded. For example, the example algorithms in the MIT Algorithms book usually present examples without duplicates. It is fairly trivial to implement duplicates (either as a list at the node, or in one particular direction.)

Most (that I've seen) specify left children as <= and right children as >. Practically speaking, a BST which allows either of the right or left children to be equal to the root node, will require extra computational steps to finish a search where duplicate nodes are allowed.

It is best to utilize a list at the node to store duplicates, as inserting an '=' value to one side of a node requires rewriting the tree on that side to place the node as the child, or the node is placed as a grand-child, at some point below, which eliminates some of the search efficiency.

You have to remember, most of the classroom examples are simplified to portray and deliver the concept. They aren't worth squat in many real-world situations. But the statement, "every element has a key and no two elements have the same key", is not violated by the use of a list at the element node.

So go with what your data structures book said!

Edit:

Universal Definition of a Binary Search Tree involves storing and search for a key based on traversing a data structure in one of two directions. In the pragmatic sense, that means if the value is <>, you traverse the data structure in one of two 'directions'. So, in that sense, duplicate values don't make any sense at all.

This is different from BSP, or binary search partition, but not all that different. The algorithm to search has one of two directions for 'travel', or it is done (successfully or not.) So I apologize that my original answer didn't address the concept of a 'universal definition', as duplicates are really a distinct topic (something you deal with after a successful search, not as part of the binary search.)