Iterative Deepening A Star in R

Posted: , Last Updated:

iterative_deepening_a_star <- function(tree, heuristic, start, goal){
  #' Performs the iterative deepening A Star (A*) algorithm to find the shortest path from a start to a target node.
  #' Can be modified to handle graphs by keeping track of already visited nodes.
  #' @param tree      An adjacency-matrix-representation of the tree where (x,y) is the weight of the edge or 0 if there is no edge.
  #' @param heuristic An estimation of distance from node x to y that is guaranteed to be lower than the actual distance. E.g. straight-line distance.
  #' @param start     The node to start from.
  #' @param goal      The node we're searching for.
  #' @return number shortest distance to the goal node. Can be easily modified to return the path.
  
  threshold = heuristic[start,goal]
  repeat{
    cat("Iteration with threshold: ", threshold, "\n")
    distance = iterative_deepening_a_star_rec(tree, heuristic, start, goal, 0, threshold)
    if(distance == Inf){
      # Node not found and no more nodes to visit
      return (-1)
    }else if (distance < 0){
      # if we found the node, the function returns the negative distance
      print("Found the node we're looking for!")
      return (-distance)
    }else{
      # if it hasn't found the node, it returns the (positive) next-bigger threshold
      threshold = distance
    }
  }
}

iterative_deepening_a_star_rec <- function(tree, heuristic, node, goal, distance, threshold){
  #' Performs DFS up to a depth where a threshold is reached (as opposed to interative-deepening DFS which stops at a fixed depth).
  #' Can be modified to handle graphs by keeping track of already visited nodes.
  #' @param tree      An adjacency-matrix-representation of the tree where (x,y) is the weight of the edge or 0 if there is no edge.
  #' @param heuristic An estimation of distance from node x to y that is guaranteed to be lower than the actual distance. E.g. straight-line distance.
  #' @param node      The node to continue from.
  #' @param goal      The node we're searching for.
  #' @param distance  Distance from start node to current node.
  #' @param threshold Until which distance to search in this iteration.
  #' @return number shortest distance to the goal node. Can be easily modified to return the path.
  
  cat("Visiting Node ", node, "\n")
  
  if(node == goal){
    # We have found the goal node we we're searching for
    return (-distance)
  }
  estimate = distance + heuristic[node,goal]
  if(estimate > threshold){
    cat("Breached threshold with heuristic: ", estimate, "\n")
    return (estimate)
  }
  # ...then, for all neighboring nodes....
  min = Inf
  for(i in seq_along(tree[node,])) {
    if(tree[node,i] != 0){
      t = iterative_deepening_a_star_rec(tree, heuristic, i, goal, distance + tree[node,i], threshold)
      if(t < 0){
        # Node found
        return (t)
      } else if  (t < min){
        min = t
      }
    }
  }
  
  return (min)
}

About the algorithm and language used in this code snippet:

Iterative Deepening A Star Algorithm

The Iterative Deepening A Star (IDA*) algorithm is an algorithm used to solve the shortest path problem in a tree, but can be modified to handle graphs (i.e. cycles). It builds on Iterative Deepening Depth-First Search (ID-DFS) by adding an heuristic to explore only relevant nodes.

Description of the Algorithm

Whereas Iterative Deepening DFS uses simple depth to decide when to abort the current iteration and continue with a higher depth, Iterative Deepening A Star uses a heuristic to determine which nodes to explore and at which depth to stop. This is similar to how Dijkstra always explores the node with the currently shortest difference and A Star adds an heuristic to this to only explore nodes that are actually closer to the goal.

In more detail, this leads to the following Steps:

  1. For each child of the current node
  2. If it is the target node, return
  3. If the distance plus the heuristic exceeds the current threshold, return this exceeding threshold
  4. Set the current node to this node and go back to 1.
  5. After having gone through all children, go to the next child of the parent (the next sibling)
  6. After having gone through all children of the start node, increase the threshold to the smallest of the exceeding thresholds.
  7. If we have reached all leaf (bottom) nodes, the goal node doesn’t exist.

Example of the Algorithm

Consider the following graph: Graph for the Iterative Deepening A Star (IDA*) shortest path algorithm

The steps the algorithm performs on this graph if given node 0 as a starting point and node 6 as the goal, in order, are:

  1. Iteration with threshold: 6.32
  2. Visiting Node 0
  3. Visiting Node 1
  4. Breached threshold with heuristic: 8.66
  5. Visiting Node 2
  6. Breached threshold with heuristic: 7.00
  7. Iteration with threshold: 7.00
  8. Visiting Node 0
  9. Visiting Node 1
  10. Breached threshold with heuristic: 8.66
  11. Visiting Node 2
  12. Visiting Node 5
  13. Breached threshold with heuristic: 8.83
  14. Iteration with threshold: 8.66
  15. Visiting Node 0
  16. Visiting Node 1
  17. Visiting Node 3
  18. Breached threshold with heuristic: 12.32
  19. Visiting Node 4
  20. Breached threshold with heuristic: 8.83
  21. Visiting Node 2
  22. Visiting Node 5
  23. Breached threshold with heuristic: 8.83
  24. Iteration with threshold: 8.83
  25. Visiting Node 0
  26. Visiting Node 1
  27. Visiting Node 3
  28. Breached threshold with heuristic: 12.32
  29. Visiting Node 4
  30. Visiting Node 2
  31. Visiting Node 5
  32. Visiting Node 6
  33. Found the node we’re looking for!

Final lowest distance from node 0 to node 6: 9

Notice how the algorithm did not continue to explore down from node 3 in the iteration it found the goal node in. If node 3 would’ve had children, whereas Iterative Deepening DFS would’ve potentially (and needlessly!) explored them, Iterative Deepening A Star did not.

Runtime of the Algorithm

The runtime complexity of Iterative Deepening A Star is in principle the same as Iterative Deepening DFS. In practice, though, if we choose a good heuristic, many of the paths can be eliminated before they are explored making for a significant time improvement. More information on how the heuristic influences the complexity can be found on the Wikipedia Article.

Space of the Algorithm

The space complexity of Iterative Deepening A Star is the amount of storage needed for the tree or graph. O(|N|), |N| = number of Nodes in the tree or graph, which can be replaced with b^d for trees, where b is the branching factor and d is the depth. Additionally, whatever space the heuristic requires.

R

The R Logo

R is an interpreted language first released in 1993 with a significant increase in popularity in recent years. It is primarily used for data mining and -science as well as statistics, and is a popular language in non-computer science disciplines ranging from Biology to Physics. R is dynamically typed, and has one of the widest variety of libraries for statistics, machine learning, data mining etc.

Getting to “Hello World” in R

The most important things first - here’s how you can run your first line of code in R.

  1. Download and install the latest version of R from r-project.org. You can also download an earlier version if your use case requires it.
  2. Open a terminal, make sure the R command is working, and that the command your’re going to be using is referring to the version you just installed by running R --version. If you’re getting a “command not found” error (or similar), try restarting your command line, and, if that doesn’t help, your computer. If the issue persists, here are some helpful StackOverflow questions for Windows, Mac and Linux.
  3. As soon as that’s working, you can run the following snippet: print("Hello World"). You have two options to run this: 3.1 Run R in the command line, just paste the code snippet and press enter (Press CTRL + D and type n followed by enter to exit). 3.2 Save the snippet to a file, name it something ending with .R, e.g. hello_world.R, and run Rscript hello_world.R. Tip: use the ls command (dir in Windows) to figure out which files are in the folder your command line is currently in.

That’s it! Notice how printing something to the console is just a single line in R - this low entry barrier and lack of required boilerplate code is a big part of the appeal of R.

Fundamentals in R

To understand algorithms and technologies implemented in R, one first needs to understand what basic programming concepts look like in this particular language.

Variables and Arithmetic

Variables in R are really simple, no need to declare a datatype or even declare that you’re defining a variable; R knows this implicitly. R is also able to easily define objects and their property, in multiple different ways.

some_value = 10
my_object <- list(my_value = 4)
attr(my_object, 'other_value') <- 3

print((some_value + my_object$my_value + attr(my_object, 'other_value'))) # Prints 17

Arrays

Working with arrays is similarly simple in R:

# Create 2 vectors of length 3
vector1 <- c(1,2,3)
vector2 <- c(4,5,6)

# Create names for rows and columns (optional)
column.names <- c("column_1","column_2","column_3")
row.names <- c("row_1","row_2")

# Concatenate the vectors (as rows) to form an array, providing dimensions and row/column names
result <- array(c(vector1,vector2), dim = c(2,3), dimnames = list(row.names, column.names))

print(result)
# Prints:
#       column_1 column_2 column_3
# row_1        1        3        5
# row_2        2        4        6

As those of you familiar with other programming language like Java might have already noticed, those are not native arrays, but rather lists dressed like arrays. This means that arrays in R are considerably slower than in lower level programming languages. This is a trade off R makes in favor of simplicity. There are, however, packages which implement real arrays that are considerably faster.

Conditions

Just like most programming languages, R can do if-else statements:

value = 1
if(value==1){
   print("Value is 1")
} else if(value==2){
     print("Value is 2")
} else {
     print("Value is something else")
}

R can also do switch statements, although they are implemented as a function, unlike in other languages like Java:

x <- switch(
   1,
   "Value is 1",
   "Value is 2",
   "Value is 3"
)

print(x)

Note that this function is pretty useless, but there are other functions for more complex use cases.

Loops

R supports both for and while loops as well as break and next statements (comparable to continue in other languages). Additionally, R supports repeat-loops, which are comparable to while(true) loops in other languages, but simplify the code a little bit.

value <- 0
repeat {
  value <- value + 1
  if(value > 10) {
    break
  }
}
print(value)

value <- 0
while (value <= 10) {
  value = value + 1
}
print(value)

value <- c("Hello","World","!")
for ( i in value) {
  print(i)
}

for(i in 1:10){
  print(i)
}

Functions

Functions in R are easily defined and, for better or worse, do not require specifying return or arguments types. Optionally, a default for arguments can be specified:

my_func <- function (
  a = "World"
) {
  print(a)
  return("!")
}

my_func("Hello")
print(my_func())

(This will print “Hello”, “World”, and then ”!“)

Syntax

R requires the use of curly brackets ({}) to surround code blocks in conditions, loops, functions etc.; While this can lead to some annoying syntax errors, it also means the use of whitespace for preferred formatting (e.g. indentation of code pieces) does not affect the code.

Advanced Knowledge of R

For more information, R has a great Wikipedia article. The official website is r-project.org.