A Star in C

Posted: , Last Updated:

package a_star.c.simple;

import java.util.Arrays;

/**
 * Used to perform the A-Star (A*) Algorithm to find the shortest path from a start to a target node.
 */
public class AStar {
    /**
     * Finds the shortest distance between two nodes using the A-star algorithm
     * @param graph an adjacency-matrix-representation of the graph where (x,y) is the weight of the edge or 0 if there is no edge.
     * @param heuristic an estimation of distance from node x to y that is guaranteed to be lower than the actual distance. E.g. straight-line distance
     * @param start the node to start from.
     * @param goal the node we're searching for
     * @return The shortest distance to the goal node. Can be easily modified to return the path.
     * */
    public static double aStar(int[][] graph, double[][] heuristic, int start, int goal) {

        //This contains the distances from the start node to all other nodes
        int[] distances = new int[graph.length];
        //Initializing with a distance of "Infinity"
        Arrays.fill(distances, Integer.MAX_VALUE);
        //The distance from the start node to itself is of course 0
        distances[start] = 0;

        //This contains the priorities with which to visit the nodes, calculated using the heuristic.
        double[] priorities = new double[graph.length];
        //Initializing with a priority of "Infinity"
        Arrays.fill(priorities, Integer.MAX_VALUE);
        //start node has a priority equal to straight line distance to goal. It will be the first to be expanded.
        priorities[start] = heuristic[start][goal];

        //This contains whether a node was already visited
        boolean[] visited = new boolean[graph.length];

        //While there are nodes left to visit...
        while (true) {

            // ... find the node with the currently lowest priority...
            double lowestPriority = Integer.MAX_VALUE;
            int lowestPriorityIndex = -1;
            for (int i = 0; i < priorities.length; i++) {
                //... by going through all nodes that haven't been visited yet
                if (priorities[i] < lowestPriority && !visited[i]) {
                    lowestPriority = priorities[i];
                    lowestPriorityIndex = i;
                }
            }

            if (lowestPriorityIndex == -1) {
                // There was no node not yet visited --> Node not found
                return -1;
            } else if (lowestPriorityIndex == goal) {
                // Goal node found
                System.out.println("Goal node found!");
                return distances[lowestPriorityIndex];
            }

            System.out.println("Visiting node " + lowestPriorityIndex + " with currently lowest priority of " + lowestPriority);

            //...then, for all neighboring nodes that haven't been visited yet....
            for (int i = 0; i < graph[lowestPriorityIndex].length; i++) {
                if (graph[lowestPriorityIndex][i] != 0 && !visited[i]) {
                    //...if the path over this edge is shorter...
                    if (distances[lowestPriorityIndex] + graph[lowestPriorityIndex][i] < distances[i]) {
                        //...save this path as new shortest path
                        distances[i] = distances[lowestPriorityIndex] + graph[lowestPriorityIndex][i];
                        //...and set the priority with which we should continue with this node
                        priorities[i] = distances[i] + heuristic[i][goal];
                        System.out.println("Updating distance of node " + i + " to " + distances[i] + " and priority to " + priorities[i]);
                    }
                }
            }

            // Lastly, note that we are finished with this node.
            visited[lowestPriorityIndex] = true;
            //System.out.println("Visited nodes: " + Arrays.toString(visited));
            //System.out.println("Currently lowest distances: " + Arrays.toString(distances));

        }
    }
}

About the algorithm and language used in this code snippet:

A Star Algorithm

The A star (A*) algorithm is an algorithm used to solve the shortest path problem in a graph. This means that given a number of nodes and the edges between them as well as the “length” of the edges (referred to as “weight”) and a heuristic (more on that later), the A* algorithm finds the shortest path from the specified start node to all other nodes. Nodes are sometimes referred to as vertices (plural of vertex) - here, we’ll call them nodes.

Description of the Algorithm

The basic principle behind the A star (A*) algorithm is to iteratively look at the node with the currently smallest priority (which is the shortest distance from the start plus the heuristic to the goal) and update all not yet visited neighbors if the path to it via the current node is shorter. This is very similar to the Dijkstra algorithm, with the difference being that the lowest priority node is visited next, rather than the shortest distance node. In essence, Dijkstra uses the distance as the priority, whereas A* uses the distance plus the heuristic.

Why does adding the heuristic make sense? Without it, the algorithm has no idea if its going in the right direction. When manually searching for the shortest path in this example, you probably prioritised paths going to the right over paths going up or down. This is because the goal node is to the right of the start node, so going right is at least generally the correct direction. The heuristic gives the algorithm this spatial information.

So if a node has the currently shortest distance but is generally going in the wrong direction, whereas Dijkstra would have visited that node next, A Star will not. For this to work, the heuristic needs to be admissible, meaning it has to never overestimate the actual cost (i.e. distance) - which is the case for straight line distance in street networks, for example. Intuitively, that way the algorithm never overlooks a shorter path because the priority will always be lower than the real distance (if the current shortest path is A, then if there is any way path B could be shorter it will be explored). One simple heuristic that fulfils this property is straight line distance (e.g. in a street network)

In more detail, this leads to the following Steps:

  1. Initialize the distance to the starting node as 0 and the distances to all other nodes as infinite
  2. Initialize the priority to the starting node as the straight-line distance to the goal and the priorities of all other nodes as infinite
  3. Set all nodes to “unvisited”
  4. While we haven’t visited all nodes and haven’t found the goal node:

    1. Find the node with currently lowest priority (for the first pass, this will be the source node itself)
    2. If it’s the goal node, return its distance
    3. For all nodes next to it that we haven’t visited yet, check if the currently smallest distance to that neighbor is bigger than if we were to go via the current node
    4. If it is, update the smallest distance of that neighbor to be the distance from the source to the current node plus the distance from the current node to that neighbor, and update its priority to be the distance plus its straight-line distance to the goal node

Example of the Algorithm

Consider the following graph: Graph for the A Star (A*) shortest path algorithm

The steps the algorithm performs on this graph if given node 0 as a starting point and node 5 as the goal, in order, are:

  1. Visiting node 0 with currently lowest priority of 8.0
  2. Updating distance of node 1 to 3 and priority to 9.32455532033676
  3. Updating distance of node 2 to 4 and priority to 10.32455532033676
  4. Visiting node 1 with currently lowest priority of 9.32455532033676
  5. Updating distance of node 3 to 9 and priority to 11.82842712474619
  6. Updating distance of node 4 to 13 and priority to 15.82842712474619
  7. Visiting node 2 with currently lowest priority of 10.32455532033676
  8. Visiting node 3 with currently lowest priority of 11.82842712474619
  9. Updating distance of node 5 to 12 and priority to 12.0
  10. Goal node found!

Final lowest distance from node 0 to node 5: 12

Runtime of the Algorithm

The runtime complexity of A Star depends on how it is implemented. If a min-heap is used to determine the next node to visit, and adjacency is implemented using adjacency lists, the runtime is O(|E| + |V|log|V|) (|V| = number of Nodes, |E| = number of Edges). If a we simply search all distances to find the node with the lowest distance in each step, and use a matrix to look up whether two nodes are adjacent, the runtime complexity increases to O(|V|^2).

Note that this is the same as Dijkstra - in practice, though, if we choose a good heuristic, many of the paths can be eliminated before they are explored making for a significant time improvement.

More information on how the heuristic influences the complexity can be found on the Wikipedia Article.

Space of the Algorithm

The space complexity of A Star depends on how it is implemented as well and is equal to the runtime complexity, plus whatever space the heuristic requires.

C

C is a compiled language used for many purposes, although it can be primarily found in systems where importance is important. This is because C offers a lot of low-level support for optimization, at the cost of not having some of the convenient abstractions that other languages offer. C is therefore primarily found in situations where available computation power is low such as embedded systems, or situations where required computation power is high, such as simulation or deep learning.

Getting to “Hello World” in C

The most important things first - here’s how you can run your first line of code in C.

  1. If you’re on Linux or Mac, download and install the latest version of GCC, a C compiler, from gcc.gnu.org. You can also download an earlier version if your use case requires it.
  2. If you’re on Windows, you can also install GCC, even though it might cause problems. You also have other options outlined e.g. here.
  3. Open a terminal, make sure the gcc command is working (or the according command for whichever compiler you’re using), and that the command your’re going to be using is referring to the version you just installed by running gcc --version. If you’re getting a “command not found” error (or similar), try restarting your command line, and, if that doesn’t help, your computer. If the issue persists, here are some helpful forum questions for each platform:

  4. As soon as that’s working, copy the following snippet into a file named HelloWorld.c:

    #include<stdio.h>
    int main() {
    printf("Hello World\n");
    return 0;
    }
  5. Change directory by typing cd path/to/HelloWorld, then run gcc HelloWorld.c to compile the file (which creates the bytecode), then run ./a.out. This should print “Hello World” to your Terminal.

That’s it! People who know multiple programming languages will notice that the entry barrier in C is a little lower than Java even though it is lower-level, while the entry barrier to Python is lower even though it is higher-level. My personal observation is that low-level and high-level languages tend to have low barriers of entry, whereas mid-level languages have higher barriers.

Fundamentals in C

To understand algorithms and technologies implemented in C, one first needs to understand what basic programming concepts look like in this particular language. Each of the following snippets should be compiled and run using the commands mentioned above.

Variables and Arithmetic

Variables in C are statically typed, meaning the content of a variable needs to be specified when writing the code. The datatype for whole numbers, for example is int. Numbers with decimal places are typed float or double depending on the required precision. The type for text ist String.

#include<stdio.h>

int main() {
    int number = 5;
    double decimalNumber = 3.25;
    double result = number * decimalNumber;
    char callout [] = "The number is ";
    // In this instance, the values are concatenated rather than added because one of them is a String.
    printf("%s", callout);
    printf("%f", result);
    printf("\n");
    return 0;
}

Arrays

Arrays in C are real arrays (as opposed to e.g. Python where they’re implemented as lists). The implications of that are that the size needs to be set when they are created and cannot be changed, but also that they are more efficient in C than they are in Python. Also, contrary to Java, C does not check array bounds. If you access an index that doesn’t exist, the program will read whatever is in the memory at that location (which will probably be gibberish).

int integers[5];
integers[3] = 12; // Assigning values to positions in the array
printf("%d\n", integers[0]); // will be 0
printf("%d\n", integers[3]); // will be 12
printf("%d\n", integers[6]); // will print something random that happened to be at that location in memory
return 0;

Conditions

Just like most programming languages, C can do if-else statements. Additionally, C can also do switch-case statements.

int value = 5;
    if(value == 5){
        printf("%s\n", "Value is 5");
    } else if(value < 5){
        printf("%s\n", "Value is less than 5");
    } else {
        printf("%s\n", "Value is something else");
    }
    
    switch (value){
        case 1:
            printf("%s\n", "Value is 1");
            break; // Don't go further down the cases
        case 2:
            printf("%s\n", "Value is 2");
            break; // Don't go further down the cases
        case 3:
            printf("%s\n", "Value is 3");
            break; // Don't go further down the cases
        case 4:
            printf("%s\n", "Value is 4");
            break; // Don't go further down the cases
        case 5:
            printf("%s\n", "Value is 5");
            break; // Don't go further down the cases
        default:
            printf("%s\n", "Value is something else");
    }

The above C code will print “Value is 5” twice.

Loops

C supports for, while as well as do while loops. break and continue statements are also supported. The below example illustrates the differences:

int value = 2;
for (int i = 0; i < value; i++) {
    printf("%d\n", i);
}
while (value > 0) {
    printf("%d\n", value);
    value--;
}
do {
    printf("%d\n", value);
    value--;
} while (value > 0);

This will print the following to the terminal:

0
1
2
1
0

Note the last 0: it is printed because in the do-while-loop, compared to the while-loop. the code block is executed at least once before the condition is checked.

Functions

Functions in C can be declared similar to Java, but require less boilerplate since they don’t need to be part of classes or objects. Here is a minimal example of a function:

#include<stdio.h>

int addNumbers(int numberOne, int numberTwo) {
    return numberOne + numberTwo;
}

int main() {
    printf("%d\n", addNumbers(3, 4));
}

Syntax

C requires the use of curly brackets ({}) to surround code blocks in conditions, loops, functions etc.; It also requires semicolons at then end of statements. While this can lead to some annoying syntax errors, it also means the use of whitespace for preferred formatting (e.g. indentation of code pieces) does not affect the code. Note how the Syntax of C is very similar to Java. The Syntax of Java, and many other languages that came after and/or were derived from C copy many aspects of its Syntax.

Advanced Knowledge of C

C was first released in 1972, is statically typed and was ported to many platforms with various implementations (one of which is GCC which was presented in this article). For more information, C has a great Wikipedia) article.