DFS in Python


def dfs(start, target):
    """
    Implementation of the DFS (depth-first search) algorithm to find a node with a given value in a tree.
    Given a start node, this returns the first node found in the tree below the start node with the target value (or None if it doesn't exist).
    Runs in O(n), where n is the number of nodes in the tree, or O(b^d), where b is the branching factor and d is the depth.
    :param start:  the node to start the search from, a dict with the keys "value" and "children"
    :param target: the value to search for
    :return: The node containing the target value or None if it doesn't exist.
    """
    print("Visiting Node " + str(start["value"]))
    if start["value"] == target:
        # We have found the goal node we were searching for
        print("Found the node we're looking for!")
        return start

    # Recurse with all children
    for child in start["children"]:
        result = dfs(child, target)
        if result is not None:
            # We've found the goal node while going down that child
            return result

    # We've gone through all children and not found the goal node
    print("Went through all children of " + str(start["value"]) + ", returning to its parent.")
    return None

About the algorithm and language used in this code snippet:

Depth-First Search Algorithm

The Depth-First Search (also DFS) algorithm is an algorithm used to find a node in a tree. This means that given a tree data structure, the algorithm will return the first node it encounters in this tree that matches the specified condition (e.g. being equal to a value). Nodes are sometimes referred to as vertices (plural of vertex) - here, we’ll call them nodes. The edges are treated as unweighted: any edge weights are simply ignored. This algorithm can also work with unweighted graphs if a mechanism to keep track of already visited nodes is added, since otherwise a cycle would make it loop forever.
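
As a minimal sketch of the visited-node mechanism for graphs, assume the graph is stored as a dictionary mapping each node to a list of its neighbours. The names dfs_graph, graph and visited are illustrative and not part of the snippet at the top of the page.

```python
def dfs_graph(graph, start, target, visited=None):
    """Return True if target is reachable from start, else False."""
    if visited is None:
        visited = set()
    visited.add(start)  # remember this node so cycles cannot trap us
    if start == target:
        return True
    for neighbour in graph.get(start, []):
        if neighbour not in visited and dfs_graph(graph, neighbour, target, visited):
            return True
    return False


# Hypothetical example graph containing the cycle 0 -> 1 -> 2 -> 0
graph = {0: [1], 1: [2], 2: [0, 3], 3: []}
print(dfs_graph(graph, 0, 3))  # True
print(dfs_graph(graph, 0, 4))  # False
```

Without the visited set, the cycle 0 -> 1 -> 2 -> 0 would cause infinite recursion.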

Description of the Algorithm

The basic principle of the algorithm is to start with a start node, and then look at the first child of this node. It then looks at the first child of that node (grandchild of the start node) and so on, until a node has no more children (we’ve reached a leaf node). It then goes up one level, and looks at the next child. If there are no more children, it goes up one more level, and so on, until it finds more children or reaches the start node. If it hasn’t found the goal node after returning from the last child of the start node, the goal node cannot be found, since by then all nodes have been traversed.

Specifically, these are the steps:

  1. If the current node is the target node, return it. The node has been found.
  2. Otherwise, for each child of the current node: set the current node to this child and go back to 1.
  3. If a child returns the target node, pass it on to the parent.
  4. If there are no more child nodes to visit, return to the parent.
  5. If the node has no parent (i.e. it is the start node), return. The node has not been found.
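
The steps above can also be sketched without recursion by keeping an explicit stack of nodes that still need to be visited. This is an illustrative variant rather than the snippet from the top of the page. It assumes the same node format (a dict with the keys "value" and "children"), and the name dfs_iterative is made up for this example.

```python
def dfs_iterative(start, target):
    """Depth-first search using an explicit stack instead of recursion."""
    stack = [start]
    while stack:
        node = stack.pop()
        if node["value"] == target:
            return node
        # Push children in reverse so that the first child is visited first
        stack.extend(reversed(node["children"]))
    return None


leaf_a = {"value": 3, "children": []}
leaf_b = {"value": 4, "children": []}
root = {"value": 1, "children": [leaf_a, leaf_b]}

print(dfs_iterative(root, 4) is leaf_b)  # True
print(dfs_iterative(root, 9))            # None
```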

Example of the Algorithm

Consider the following tree:

[Figure: Tree for the Depth-First Search algorithm]

The steps the algorithm performs on this tree if given node 0 as a starting point and 6 as the target value, in order, are:

  1. Visiting Node 0
  2. Visiting Node 1
  3. Visiting Node 3
  4. Went through all children of 3, returning to its parent.
  5. Visiting Node 4
  6. Went through all children of 4, returning to its parent.
  7. Went through all children of 1, returning to its parent.
  8. Visiting Node 2
  9. Visiting Node 5
  10. Went through all children of 5, returning to its parent.
  11. Visiting Node 6
  12. Found the node we’re looking for!
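
The shape of the tree can be inferred from the trace above: node 0 has the children 1 and 2, node 1 has the children 3 and 4, and node 2 has the children 5 and 6. As a sketch under that assumption, the tree can be built from nested dictionaries in the format dfs expects. The helper node and the function visit_order are hypothetical names used only here; visit_order follows the same traversal as dfs but records the visited values in a list instead of printing them.

```python
def node(value, children=None):
    """Build a node in the dict format used by dfs()."""
    return {"value": value, "children": children or []}


# Tree shape inferred from the trace (an assumption, the image is not available)
tree = node(0, [
    node(1, [node(3), node(4)]),
    node(2, [node(5), node(6)]),
])


def visit_order(start, target, order):
    """Same traversal as dfs(), but appends each visited value to order."""
    order.append(start["value"])
    if start["value"] == target:
        return start
    for child in start["children"]:
        result = visit_order(child, target, order)
        if result is not None:
            return result
    return None


order = []
found = visit_order(tree, 6, order)
print(order)           # [0, 1, 3, 4, 2, 5, 6]
print(found["value"])  # 6
```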

Runtime of the Algorithm

The runtime of regular Depth-First Search (DFS) is O(|N|) (|N| = number of nodes in the tree), since every node is visited at most once. A tree with branching factor b and depth d contains at most 1 + b + b^2 + ... + b^d nodes, which is O(b^d) for b > 1, so the runtime can also be written as O(b^d).
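
To make the relationship between the node count and b^d concrete, here is a small sketch that counts the nodes of a complete tree in which every inner node has exactly b children. The function name count_nodes is illustrative. For b > 1 the sum equals (b^(d+1) - 1) / (b - 1), which grows as O(b^d).

```python
def count_nodes(b, d):
    """Number of nodes in a complete tree with branching factor b and depth d."""
    # Level k (the root is level 0) contains b**k nodes
    return sum(b ** k for k in range(d + 1))


print(count_nodes(2, 2))  # 7
print(count_nodes(3, 3))  # 40
```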

Space of the Algorithm

The space complexity of Depth-First Search (DFS) is, if we exclude the tree itself, O(d), with d being the depth, which is also the size of the call stack at maximum depth. If we include the tree, the space complexity is the same as the runtime complexity, as each node needs to be saved.
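
As a rough illustration of the O(d) bound on the call stack, the following sketch threads a depth counter through the recursion and records the deepest level reached. The names dfs_max_depth, stats and wide_tree are hypothetical; nodes use the same dict format as in the snippet at the top.

```python
def dfs_max_depth(node, target, depth=0, stats=None):
    """DFS that records the deepest recursion level in stats["max_depth"]."""
    if stats is None:
        stats = {"max_depth": 0}
    stats["max_depth"] = max(stats["max_depth"], depth)
    if node["value"] == target:
        return node, stats
    for child in node["children"]:
        found, _ = dfs_max_depth(child, target, depth + 1, stats)
        if found is not None:
            return found, stats
    return None, stats


# Five nodes, but the longest path from the root has only two edges
wide_tree = {"value": 0, "children": [
    {"value": 1, "children": [{"value": 4, "children": []}]},
    {"value": 2, "children": []},
    {"value": 3, "children": []},
]}

found, stats = dfs_max_depth(wide_tree, 99)  # 99 is absent, so every node is visited
print(found)               # None
print(stats["max_depth"])  # 2
```

Even though all five nodes are visited, the recursion never goes deeper than the depth of the tree.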

Python


Python™ is an interpreted language used for many purposes ranging from embedded programming to web development, with one of the largest use cases being data science.

Getting to “Hello World” in Python

The most important things first - here’s how you can run your first line of code in Python.

  1. Download and install the latest version of Python from python.org. You can also download an earlier version if your use case requires it; some legacy projects still depend on Python 2 because of the breaking changes introduced with Python 3, although Python 2 has not been maintained since 2020.
  2. Open a terminal, make sure the python or python3 command is working, and that the command you’re going to be using refers to the version you just installed by running python3 --version or python --version. If you’re getting a “command not found” error (or similar), try restarting your command line, and, if that doesn’t help, your computer. If the issue persists, Stack Overflow has helpful questions on this for Windows, Mac and Linux.
  3. As soon as that’s working, you can run the following snippet: print("Hello World"). You have two options to run this:
     3.1 Run python in the command line, just paste the code snippet and press enter (Press CTRL + D or write exit() and press enter to exit).
     3.2 Save the snippet to a file, name it something ending with .py, e.g. hello_world.py, and run python path/to/hello_world.py.
     Tip: use the ls command (dir in Windows) to figure out which files are in the folder your command line is currently in.

That’s it! Notice how printing something to the console is just a single line in Python - this low entry barrier and lack of required boilerplate code is a big part of the appeal of Python.

Fundamentals in Python

To understand algorithms and technologies implemented in Python, one first needs to understand what basic programming concepts look like in this particular language.

Variables and Arithmetic

Variables in Python are really simple: there is no need to declare a datatype or even to declare that you’re defining a variable; Python knows this implicitly.

a = 1
b = {'c':2}

print(a + b['c']) # prints 3
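
As a small illustration of this implicit typing, the same name can be rebound to values of different types, and type() reports the type of whatever value the name currently holds. The variable name x is arbitrary.

```python
x = 1
print(type(x))  # <class 'int'>

x = "one"
print(type(x))  # <class 'str'>

x = 1 / 2       # the / operator on two ints yields a float
print(x)        # 0.5
```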

Arrays

Working with arrays is similarly simple in Python:

arr = ["Hello", "World"]

print(arr[0]) # Hello
print(arr[1]) # World
# print(arr[2]) # IndexError

arr.append("!")

print(arr[2]) # !

As those of you familiar with other programming languages like Java might have already noticed, those are not native arrays, but rather lists dressed like arrays. This is evident from the fact that no size needs to be specified, and elements can be appended at will. In fact, print(type(arr)) prints <class 'list'>. Because a list stores references to arbitrary objects rather than values of one fixed type, numeric work on lists is considerably slower than on arrays in lower-level programming languages. There are, however, packages like numpy which implement real arrays that are considerably faster.
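
numpy is a third-party package, so as a dependency-free sketch the example below uses the standard library's array module instead, which also stores elements of one fixed type. This is a swapped-in illustration of typed arrays, not the numpy API; the variable name numbers is arbitrary.

```python
from array import array

numbers = array("i", [1, 2, 3])  # type code "i" means signed int
numbers.append(4)
print(numbers[3])  # 4

try:
    numbers.append("five")  # a str does not fit an int array
except TypeError as error:
    print("TypeError:", error)
```

Unlike a list, the typed array rejects elements of the wrong type.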

Conditions

Just like most programming languages, Python can do if-else statements:

value = 1
if value == 1:
    print("Value is 1")
elif value == 2:
    print("Value is 2")
else:
    print("Value is something else")

Before version 3.10, Python did not have the case-statements that other languages like Java have; since version 3.10 it offers a match statement with case clauses. In my opinion, for simple comparisons the simplicity of the if-statements makes the “syntactic sugar” of case-statements largely unnecessary.

Loops

Python supports both for and while loops as well as break and continue statements. While it does not have do-while loops, it does have a number of built-in functions that make looping very convenient, like enumerate or range. Here are some examples:

value = 10
while value > 0:
    print(value)
    value -= 1

for index, character in enumerate("banana"):
    print("The %d-th letter is a %s" % (index + 1, character))

Note that Python does not share the common counting-loop syntax of other languages (e.g. for(int i = 0; i < arr.length; i++) in Java). Instead, range can be used to loop over indices, and enumerate to loop over indices and elements together.
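
As a short sketch of how an index-based loop over a list can be written in Python (the list arr is just sample data):

```python
arr = ["Hello", "World"]

# Index only, closest to the classic counting loop
for i in range(len(arr)):
    print(i, arr[i])

# Index and element together
for i, element in enumerate(arr):
    print(i, element)

# Element only, when no index is needed
for element in arr:
    print(element)
```

The first two loops print the same lines: 0 Hello and 1 World.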

Functions

Functions in Python are easily defined and, for better or worse, do not require specifying return or argument types. Optionally, a default value for arguments can be specified:

def print_something(something="Hello World"):
    print(something)
    return "Success"

print_something()
print(print_something("banana"))

(This will print “Hello World”, “banana”, and then “Success”)

Syntax

As you might have noticed, Python does not use curly brackets ({}) to surround code blocks in conditions, loops, functions etc. This is because Python depends on indentation (whitespace) as part of its syntax. Whereas you can add and delete any amount of whitespace (spaces, tabs, newlines) between tokens in Java without changing the program, doing so can break the syntax in Python or change what the program does. This also means that semicolons are not required; forgetting them is a common syntax error in other languages.
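
To illustrate how indentation carries meaning, the two hypothetical functions below differ only in the indentation of the return statement, yet they behave differently:

```python
def count_positive_a(numbers):
    count = 0
    for n in numbers:
        if n > 0:
            count += 1
    return count  # belongs to the function body: runs once, after the loop


def count_positive_b(numbers):
    count = 0
    for n in numbers:
        if n > 0:
            count += 1
        return count  # belongs to the loop body: runs during the first iteration


print(count_positive_a([1, 2, 3]))  # 3
print(count_positive_b([1, 2, 3]))  # 1
```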

Advanced Knowledge of Python

Python was first released in 1991 and is multi-paradigm, meaning while it is primarily imperative and functional, it also has object-oriented and reflective elements. It’s dynamically typed, but has offered syntax for gradual typing since version 3.5. For more information, Python has a great Wikipedia article.
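
As a brief sketch of what the gradual typing syntax looks like, the hypothetical function find_index below annotates its parameters and return value. The interpreter does not enforce these hints at runtime; they are meant for external type checkers and for readers.

```python
from typing import Optional


def find_index(items: list, wanted: str) -> Optional[int]:
    """Return the index of wanted in items, or None if it is absent."""
    for index, item in enumerate(items):
        if item == wanted:
            return index
    return None


print(find_index(["a", "b"], "b"))  # 1
print(find_index(["a", "b"], "z"))  # None
```

The annotations are stored on the function and can be inspected via find_index.__annotations__.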