Friday, August 24, 2012

Tree traversal - the Python recursion

This is the second part of of my tree traversal series. In part I I've went through a classical recursion algorithm, and today I'll rewrite it "in a Python way".

Python recursion solution

Given the same tree structure as in part I:
def traverse_recursive(node, path=list()):
    path.append(node.name)
    yield path
    for n in node.children:
        for path in traverse_recursive(n, path):
            yield path
    if path:
        path.pop()

The key approach is to use python yield statement, which makes this function a generator. Here is in depth guide to generators, but in a nutshell, when python runs function which "yield"s instead of returning, two things happen:

  • When function is called, then none of the function code actually executes, just a generator object is returned
  • Each time generator object is iterated, function code runs until it yields - then yielded value is returned and function execution paused until the next iterator iteration
  • The important thing to note is that all of the function context persists between subsequent iterations

The beauty of this technique is that traversed paths are returned to the caller as soon as they are found. Additionally, if caller wants to pause parsing of tree paths, tree traversal pauses as well. Caller can even delegate results processing to another function at some later stage by simply passing generator object over.

Complexity

Time: O(n) - same as with classic recursion.

Memory: O(log n) + O(log n). First O(log n) for recursion depth, second O(log n) path buffer.

Advantages

  • Low memory footprint
  • Results are available immediately
  • Tree is traversed "as you(caller) go". Traversal can be paused/aborted any time.
Disadvantages
  • Recursion. On very deep trees one may have to change Python's max execution depth