Imperative BBT part 2: Binary Search Trees

2023-03-07 09:00:00 CET

First we define the type of BST labelled with integers:

struct bst_node {
  struct node node;
  int value;
};

struct bst_node *
bst_left(struct bst_node *node)
{
  return (struct bst_node *)node->node.left;
}

struct bst_node *
bst_right(struct bst_node *node)
{
  return (struct bst_node *)node->node.right;
}

struct bst_node *
bst_make(struct bst_node *self, struct bst_node *left, struct bst_node *right)
{
  return (struct bst_node*)make_tree(&self->node, &left->node, &right->node);
}

There are mostly wrappers over the struct node type.

Insertion

Insertion of a node in a BST is simple: if the tree is empty, we simply return the inserted node. Otherwise, we check if we should insert in the left or the right sub-trees by comparing the label.

Like before, the node to insert has to be provided by the caller, so that this code does not deal with allocation.

struct bst_node *
bst_insert(struct bst_node *new_node, struct bst_node *tree)
{
  if (!tree)
    return bst_make(new_node, NULL, NULL);
  
  struct bst_node *left = bst_left(tree);
  struct bst_node *right = bst_right(tree);
  
  if (new_node->value < tree->value)
    left = bst_insert(new_node, left);
  else
    right = bst_insert(new_node, right);
  
  return bst_make(tree, left, right);
}

This code is literally the same as insertion in a plain BST. However, since bst_make uses make_tree under-the-hood, we get a balanced tree without any effort!

Deletion

The story is almost the same for deletion: we implement the usual BST deletion algorithm. The deleted node, if any, is returned to the caller for memory management purposes.

struct bst_node *
bst_delete(struct bst_node *tree, int value, struct bst_node **deleted)
{
  if (!tree)
  {
    *deleted = NULL;
    return NULL;
  }
  
  struct bst_node *left = bst_left(tree);
  struct bst_node *right = bst_right(tree);
  
  if (value == tree->value)
  {
    *deleted = tree;
    return bst_join(left, right);
  }
  
  if (value < tree->value)
    left = bst_delete(left, value, deleted);
  else
    right = bst_delete(right, value, deleted);
  
  return bst_make(tree, left, right);
}

However, we need a new operation: bst_join, that constructs a tree by concatenating, in order, the nodes of two trees. This new helper is necessary because after deleting a node, we still need to rebuild a tree from its two sub-trees. We will look at it now.

The join operation

The join operation isn't specific two BSTs (it doesn't depend on labels), it can be defined on any balanced tree. We can implement it by building upon make_tree, that relieves us from thinking about balancing.

However, it can be useful to look at the height to decide which sub-tree to traverse. By recursing on the deeper one, we minimize the balancing work needed... Though this is a minor optimization.

struct node *
join_tree(struct node *left, struct node *right)
{
  if (!right)
    return left;
  if (!left)
    return right;
    
  if (left->height < right->height)
    return make_tree(right, join_tree(left, right->left), right->right);
  else
    return make_tree(left, left->left, join_tree(left->right, right));
}

And finally, we lift the operation to work on BST:

struct bst_node *
bst_join(struct bst_node *left, struct bst_node *right)
{
  return (struct bst_node*)join_tree(&left->node, &right->node);
}

Result

In this post we saw that the primitive make_tree from the first article makes the implementation of balanced BSTs trivial. The insertion and deletion algorithms are very close to the naive ones, yet we get balancing almost for free.