A normal search across a BST (Binary Search Tree) would look like this
bool BinaryTree::Search (int data )
{
Node *current = this->root;
while ( current != NULL )
{
if (current->data < data)
{
current = current->left;
}
else if (current->data > data)
{
current = current->right;
}
else if ( current->data == data )
{
return true;
}
}
return false;
}
You keep going down the tree, until you find a node whose value is equal to one you are looking for, or you bail out when you hit a leaf (NULL) node. If you look at the number of statements, there is one conditional check on the while, and an average of 1.5 conditional checks inside the loop. That makes it a total of 2.5 checks every iteration. On a tree with a 1000 nodes, that’s 2500 checks.
Let’s see how we can improve this. I am using the sentinel node technique for this purpose. In this case static Node * Leaf;
This is how the search will look like
static Node* Leaf;
bool BinaryTree::Search (int data )
{
Node *current = this->root;
Leaf->data = data;
while ( current->data != lead->data )
{
if (current->data < data)
{
current = current->left;
}
else
{
current = current->right;
}
}
return (current != Leaf);
}
The sentinel is a static node, and while building the tree, you point all the leaf nodes to this sentinel node instead of NULL. Before you start the search, you set the value of the sentinel node to the data you are searching for. This way you are guaranteed to get a hit. You just need to do one extra check at the end to see if the Hit node was a Node in the tree or the sentinel node. If you look at the number of conditional statements in the loop, there is one in the while statement and one inside the loop, that makes it 2 searches every iteration. You just saved half a conditional statement. Don’t underestimate this improvement. E.g. In a 1000 iteration loop you saved 500 checks.
This is extremely useful with large trees or when you are searching the tree several times or any other scenario where this call happens in a hot section.
Comments
Sorry but I don't understand
Sorry but I don't understand the example. If the tree has 1000 nodes the number of comparisons should be approx log (to the base 2) of 1000, or about 10. That's if the tree is a perfect binary tree; if it's a balanced binary tree (Knuth Vol 3, Sec 6.2.3) the max height is 1.4 x log (1000+2), or about 15.
If the tree isn't balanced, I agree the pathological case would be 1000 comparisons, but why bother having a binary tree if you aren't going to keep it balanced?
agreed that a many ifs are
agreed that a many ifs are skipped, but that doesnt
improve the complexity of your algorithm.
I like the idea though... do we have other places where we use
this kind of approach ?