CSC148 - Python II: Week9 - Binary Search Tree

Hi Everyone,

I feel this week's lab is overwhelming. I think I am good with recursion but I am very unfamiliar with the new Binary Search Tree ADT. BST seems very easy since it belongs to a tree ADT and has all the characteristic of a tree. The only difference is that its nodes are arranged in order "value of left subtree < value of root < value of right tree". Maybe I will feel better if I practice more in the weekend :P

The logic to build a BST exactly using the logic of build LinkedList using LListNode. Instead of construct a LinkedList this time, we construct a Tree. There are several functions that lighted my mind this week and I am going to share these with you:
1. Three functions from lecture:
"_str(b: BTNode, i: str) -> str", "_find(node:BTNode, data:object)", "_height(node)"
2. One function from lab
write recursive function iteratively (use looping)

Three functions from Lecture:

1. _str

I was wondering what's the difference between __str__ and __repr__. But since I finished reading A2 part1's requirement for __repr__ method, I started to have a little feeling of it. After I done A2 part1, I think I understand it. __str__ is only to return string of values which is clear and comfortable for human to read, while __repr__ is to generate representations which can be read by the interpreter. That's why when we use __repr__ to represent the binary search tree, "that an equivalent tree will be produced if you cut and paste the representation into a Python shell.

The first function also taught me that we can include the kind of indent we want in our string representation, by just adding the indent parameter "i" to the return statement of string operation.

def _str(b: BTNode, i: str) -> str:

"""Return a string representing self in order, indent by i"""

return ((_str(b.right, i + ' ') if b.right else '') +

i + str(b.data) + '\n' +

(_str(b.left, i + ' ') if b.left else ''))

* we will call this function by _str(b, ' '), the ' ' here indicates we want our indent type be a blank. This blank will be added to anywhere we write i in the return statement.

2. _find

The _find function indicates the main benefit of Binary SEARCH Tree. This type is very useful and fast for searching the things we want since its nodes are in order. e.g If node value is less than the item that we're searching for, next step we only search its right subtree.

def _find(node: BTNode, data: object):

"""Return the node containing data, or else None."""

if not node or node.data == data:

return node

else:

return (_find(node.left, data) if (data < node.data)else _find(node.right, data))

_find is only the helper function of BST's find, with parameter's type BTNode. We will call this function in our BST class to create a find method for BST, which is the one that we actually want to search from. Then the parameter of BST's method can be BST type, with helper _find function's parameter BTNode type.

def find(self: 'BST', data: object) -> BTNode:

"""Return node containing data, otherwise return None."""

return _find(self._root, data)

3. _height

But another feature is instead of a single class, we constructed a Node and BST class, and node class is constructed as left, root, right separately. So it is easier to write recursive function and our code is made clearer by this. Let's compare the method that returns the max height of a general tree and of a binary search tree.
def _height(tree):
"""Return length of longest path of tree"""
return 1+max([height(c) for c in t.children]) if t.children else 0

def _height(node):
"""Return length of longest path of tree rooted at node."""
return 1 + max(_height(node.left), _height(node.right)) if node else -1
This function find left subtree and right subtree separately, we are doing two recursions at the same time. Even though the efficiency is less than we do recursion on the same size, one subtree; the total time spending to find the max height is less than single recursion.

Also, notice we have 0 for base case in _height(tree) but -1 for base in _height(node), this is because we check t.children in tree, we check node(root) in node. If the node has no children, there will be no more height and no extra-added height, so we return 0. If the node does not exist, then we return

One Function From Lab:

We implemented the count_less method for BST in 4 different ways in this weeks's lab:

1. Recursive Function:
For this part, we leave our BTNode class unmodified. In BST, we implemented a nested helper function _count_less within our count_less function, and then we called our _count_less within count_less.

2. Recursive Method:
This time we added a _count_less method to our BTNode class. In BST, we call this method and applied to our self.root. This is exactly followed from the Recursive Function part, simply by removing the "_count_less(root.left, item) + 1 + _count_less(root.right, item) if root.item < item else 0" line to BTNode class.

3. Recursive Method without None:
Well, this was easy once we've successfully developed Recursive Method, but it took us some time to figure it out. HAHA~ We implemented it by just deleting the None case from Recursive Method, since now we know our node will never be None.

4. Iteratively with Looping:

We finished the above three implementations in the first hour and we spent an hour for the last implementation! I can't see why we need to change a recursive function to a looping function, since recursion seems simplified our code a lot and it is easy to write than a loooong looping. But I'd like to talk about our thinking for this function here, for seeking the comment of our TAs and looking for yours! :))

We thought of adding item to the stack to store it first, then we can use pop method to get the node that we want to check with a loop. We decided to push self.root first then consider the different conditions for self.root, if self.root is less than item, then there is a chance the right subtree values are also less than item. We can also push the right subtree to the stack to be checked later. No matter self.root is greater than, less than or equal to the item, we need to check the left tree always. Therefore, if node.item < item, we push (node.right) to the stack, then out of the condition we push(node.left) to the stack. Then we start to check, we pop the end of stack. Since we added node.left every time no matter we added node.right or not, therefore before node.left is None, the poped item will always be node.left.
Then our code actually check all the left subtrees first, then after Stack poped out all "node.left"'s, we started to check "node.right"'s. When the Stack is empty, we did our check for all the potential <item nodes.
My partner and I had an argument when we've done our codes. She was thinking that we check all the nodes of the BST, which is not efficient since we have an ordered BST. However, I thought we didn't check all, we only check those Potential nodes that have chance to be less than item. We only push node.right if node.item < item. Node.right always has a value greater than node, so once we find node.item is = or > item, we won't push this Node's right subtree.

We also talked Binary Search Tree's searching algorithm this week, but I feel I didn't really get it. Danny has posted a sorting function for next week's lecture, so I guess we will learn more about sorting next week. So I will just skip it here and added to my next post~

How's everyone's summer plan? My friends and I are thinking of going to Banff. I love Luis Lake, it is gorgeous! :D

CSC148 - Python II

Sunday, 16 March 2014

Week9 - Binary Search Tree