We can do better than the ordered-list representation by arranging the set elements in the form of a tree. Each node of the tree holds one element of the set, called the “entry” at that node, and a link to each of two other (possibly empty) nodes. The “left” link points to elements smaller than the one at the node, and the “right” link to elements greater than the one at the node. Figure 2.16 shows some trees that represent the set `{1, 3, 5, 7, 9, 11}`. The same set may be represented by a tree in a number of different ways. The only thing we require for a valid representation is that all elements in the left subtree be smaller than the node entry and that all elements in the right subtree be larger.

The advantage of the tree representation is this: Suppose we want to check whether a number `x` is contained in a set. We begin by comparing `x` with the entry in the top node. If `x` is less than this, we know that we need only search the left subtree; if `x` is greater, we need only search the right subtree. Now, if the tree is “balanced,” each of these subtrees will be about half the size of the original. Thus, in one step we have reduced the problem of searching a tree of size `n` to searching a tree of size `n/2`. Since the size of the tree is halved at each step, we should expect that the number of steps needed to search a tree of size `n` grows as `Θ(log n)`. (Halving the size of the problem at each step is the distinguishing characteristic of logarithmic growth, as we saw previously.) For large sets, this will be a significant speedup over the previous representations.
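To make the halving argument concrete, here is a minimal sketch that counts how many times a problem of size `n` can be halved before only one element remains. The name `halving-steps` is a hypothetical helper introduced for illustration; it is not part of the set representation.

```scheme
;; Hypothetical helper: how many times can a problem of size n be
;; halved (rounding down) before it shrinks to a single element?
(define (halving-steps n)
  (if (<= n 1)
      0
      (+ 1 (halving-steps (quotient n 2)))))

(halving-steps 8)        ; 3
(halving-steps 1024)     ; 10
(halving-steps 1000000)  ; 19
```

A million-element balanced tree would therefore need on the order of 20 comparisons per search, whereas an ordered list of the same size could require up to a million.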

We can represent trees by using lists. Each node will be a list of three items: the entry at the node, the left subtree, and the right subtree. A left or a right subtree of the empty list will indicate that there is no subtree connected there. We can describe this representation by the following procedures:

```scheme
(define (entry tree) (car tree))
(define (left-branch tree) (cadr tree))
(define (right-branch tree) (caddr tree))
(define (make-tree entry left right)
  (list entry left right))
```

*Note*: We are representing sets in terms of trees, and trees in terms of lists; in effect, a data abstraction built upon a data abstraction. We can regard the procedures `entry`, `left-branch`, `right-branch`, and `make-tree` as a way of isolating the abstraction of a “binary tree” from the particular way we might wish to represent such a tree in terms of list structure.
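As a usage sketch, the constructor and selectors can be used to build and inspect one possible tree for the set `{1, 3, 5, 7, 9, 11}`. The names `leaf` and `sample-tree` are hypothetical, introduced only for this example.

```scheme
(define (entry tree) (car tree))
(define (left-branch tree) (cadr tree))
(define (right-branch tree) (caddr tree))
(define (make-tree entry left right)
  (list entry left right))

;; Hypothetical helper: a node whose subtrees are both empty.
(define (leaf x) (make-tree x '() '()))

;; One valid tree for {1, 3, 5, 7, 9, 11}, with 7 at the top node.
(define sample-tree
  (make-tree 7
             (make-tree 3 (leaf 1) (leaf 5))
             (make-tree 9 '() (leaf 11))))

(entry sample-tree)                                ; 7
(entry (left-branch sample-tree))                  ; 3
(entry (right-branch (right-branch sample-tree)))  ; 11

sample-tree
;; (7 (3 (1 () ()) (5 () ())) (9 () (11 () ())))
```

The printed list structure is an artifact of the chosen representation; code that sticks to `entry`, `left-branch`, `right-branch`, and `make-tree` does not depend on it.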

Now we can write the `element-of-set?` procedure using the strategy described above:

```scheme
(define (element-of-set? x set)
  (cond ((null? set) false)
        ((= x (entry set)) true)
        ((< x (entry set))
         (element-of-set? x (left-branch set)))
        ((> x (entry set))
         (element-of-set? x (right-branch set)))))
```
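The following sketch traces a few searches on a small tree. It repeats the definitions so that it runs on its own, and it writes `#t` and `#f` in place of `true` and `false`, on the assumption that some Scheme implementations do not predefine those names. The tree `sample-tree` is a hypothetical example value.

```scheme
(define (entry tree) (car tree))
(define (left-branch tree) (cadr tree))
(define (right-branch tree) (caddr tree))

(define (element-of-set? x set)
  (cond ((null? set) #f)
        ((= x (entry set)) #t)
        ((< x (entry set))
         (element-of-set? x (left-branch set)))
        (else
         (element-of-set? x (right-branch set)))))

;; Hypothetical example tree for {1, 3, 5, 7, 9, 11}.
(define sample-tree
  '(7 (3 (1 () ()) (5 () ()))
      (9 () (11 () ()))))

;; 5 < 7, go left; 5 > 3, go right; 5 = 5, found.
(element-of-set? 5 sample-tree)   ; #t

;; 6 < 7, go left; 6 > 3, go right; 6 > 5, go right; empty, not found.
(element-of-set? 6 sample-tree)   ; #f

(element-of-set? 11 sample-tree)  ; #t
```

Each search visits at most one node per level of the tree, which is why the depth of the tree, rather than its total size, bounds the number of comparisons.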

Adjoining an item to a set is implemented similarly and also requires `Θ(log n)` steps. To adjoin an item `x`, we compare `x` with the node entry to determine whether `x` should be added to the right or to the left branch, and having adjoined `x` to the appropriate branch we piece this newly constructed branch together with the original entry and the other branch. If `x` is equal to the entry, we just return the node. If we are asked to adjoin `x` to an empty tree, we generate a tree that has `x` as the entry and empty right and left branches. Here is the procedure:

```scheme
(define (adjoin-set x set)
  (cond ((null? set) (make-tree x '() '()))
        ((= x (entry set)) set)
        ((< x (entry set))
         (make-tree (entry set)
                    (adjoin-set x (left-branch set))
                    (right-branch set)))
        ((> x (entry set))
         (make-tree (entry set)
                    (left-branch set)
                    (adjoin-set x (right-branch set))))))
```
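A short usage sketch of `adjoin-set`, repeated here with its supporting definitions so that it runs on its own. The tree `sample-tree` is a hypothetical example value. Note that the procedure builds a new tree and leaves the original untouched.

```scheme
(define (entry tree) (car tree))
(define (left-branch tree) (cadr tree))
(define (right-branch tree) (caddr tree))
(define (make-tree entry left right)
  (list entry left right))

(define (adjoin-set x set)
  (cond ((null? set) (make-tree x '() '()))
        ((= x (entry set)) set)
        ((< x (entry set))
         (make-tree (entry set)
                    (adjoin-set x (left-branch set))
                    (right-branch set)))
        (else
         (make-tree (entry set)
                    (left-branch set)
                    (adjoin-set x (right-branch set))))))

;; Hypothetical example tree for {1, 3, 5, 7, 9, 11}.
(define sample-tree
  '(7 (3 (1 () ()) (5 () ()))
      (9 () (11 () ()))))

;; 4 < 7, go left; 4 > 3, go right; 4 < 5, so 4 becomes the left child of 5.
(adjoin-set 4 sample-tree)
;; (7 (3 (1 () ()) (5 (4 () ()) ())) (9 () (11 () ())))

;; Adjoining an element that is already present yields an equal tree.
(equal? (adjoin-set 9 sample-tree) sample-tree)  ; #t
```

Only the nodes along the path from the root to the insertion point are rebuilt; the untouched branches are shared between the old and new trees.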

The above claim that searching the tree can be performed in a logarithmic number of steps rests on the assumption that the tree is “balanced,” i.e., that the left and the right subtree of every tree have approximately the same number of elements, so that each subtree contains about half the elements of its parent. But how can we be certain that the trees we construct will be balanced? Even if we start with a balanced tree, adding elements with `adjoin-set` may produce an unbalanced result. Since the position of a newly adjoined element depends on how the element compares with the items already in the set, we can expect that if we add elements “randomly” the tree will tend to be balanced on the average. But this is not a guarantee. For example, if we start with an empty set and adjoin the numbers 1 through 7 in sequence we end up with the highly unbalanced tree shown in Figure 2.17. In this tree all the left subtrees are empty, so it has no advantage over a simple ordered list. One way to solve this problem is to define an operation that transforms an arbitrary tree into a balanced tree with the same elements. Then we can perform this transformation after every few `adjoin-set` operations to keep our set in balance. There are also other ways to solve this problem, most of which involve designing new data structures for which searching and insertion both can be done in `Θ(log n)` steps. (Examples of such structures include B-trees and red-black trees. There is a large literature on data structures devoted to this problem. See Cormen et al. 1990.)
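The sketch below reproduces the unbalanced case and shows one possible rebalancing operation: flatten the tree to an ordered list, then rebuild it by repeatedly choosing the middle element as the entry. This is an illustration under stated assumptions, not necessarily the transformation the text has in mind. The names `adjoin-all`, `tree-height`, `tree->list`, `partial-tree`, and `rebalance` are introduced here for the example.

```scheme
(define (entry tree) (car tree))
(define (left-branch tree) (cadr tree))
(define (right-branch tree) (caddr tree))
(define (make-tree entry left right) (list entry left right))

(define (adjoin-set x set)
  (cond ((null? set) (make-tree x '() '()))
        ((= x (entry set)) set)
        ((< x (entry set))
         (make-tree (entry set)
                    (adjoin-set x (left-branch set))
                    (right-branch set)))
        (else
         (make-tree (entry set)
                    (left-branch set)
                    (adjoin-set x (right-branch set))))))

;; Hypothetical helper: adjoin each item of a list, left to right.
(define (adjoin-all items set)
  (if (null? items)
      set
      (adjoin-all (cdr items) (adjoin-set (car items) set))))

;; Hypothetical helper: number of nodes on the longest path from the root.
(define (tree-height tree)
  (if (null? tree)
      0
      (+ 1 (max (tree-height (left-branch tree))
                (tree-height (right-branch tree))))))

;; In-order traversal: the entries as an ordered list.
(define (tree->list tree)
  (if (null? tree)
      '()
      (append (tree->list (left-branch tree))
              (cons (entry tree)
                    (tree->list (right-branch tree))))))

;; Build a balanced tree from the first n items of an ordered list.
;; Returns a pair: (tree . items-not-used).
(define (partial-tree elts n)
  (if (= n 0)
      (cons '() elts)
      (let* ((left-size (quotient (- n 1) 2))
             (left-result (partial-tree elts left-size))
             (rest (cdr left-result))
             (right-size (- n left-size 1))
             (right-result (partial-tree (cdr rest) right-size)))
        (cons (make-tree (car rest)
                         (car left-result)
                         (car right-result))
              (cdr right-result)))))

(define (rebalance tree)
  (let ((elts (tree->list tree)))
    (car (partial-tree elts (length elts)))))

;; Adjoining 1 through 7 in order gives a chain of right branches.
(define chain (adjoin-all '(1 2 3 4 5 6 7) '()))
(tree-height chain)              ; 7
(tree-height (rebalance chain))  ; 3
(rebalance chain)
;; (4 (2 (1 () ()) (3 () ())) (6 (5 () ()) (7 () ())))
```

Rebalancing walks the whole tree, so running it after every single `adjoin-set` would defeat the purpose; applying it only occasionally is the trade-off the paragraph above describes.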
