Greatest Common Divisor in R

gcd <- function (a,b) {
  #' Recursive implementation to find the gcd (greatest common divisor) of two integers using the euclidean algorithm.
  #' For more than two numbers, e.g. three, you can nest the calls like this: gcd(a, gcd(b, c)) etc.
  #' This runs in O(log(n)) where n is the maximum of a and b.
  #' @param a the first integer
  #' @param b the second integer
  #' @return the greatest common divisor (gcd) of the two integers.
  print(sprintf("New *a* is %s, new *b* is %s",a,b))
  if(b == 0){
    print(sprintf("b is 0, stopping recursion, a is the gcd: %s", a))
    return (a)
  }
  print(sprintf("Recursing with new a = b and new b = a %% b..."))
  gcd(b, a %% b)
}

print(gcd(10,20))
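The nesting mentioned in the function's comment generalizes to any number of integers. Here is a minimal sketch, assuming a hypothetical helper gcd_quiet (the same recursion without the print calls) and a hypothetical wrapper gcd_many built on base R's Reduce:

```r
# Hypothetical helper: same Euclidean recursion as gcd() above, without the logging.
gcd_quiet <- function(a, b) {
  if (b == 0) {
    return(a)
  }
  gcd_quiet(b, a %% b)
}

# Reduce() folds the vector pairwise: gcd_quiet(gcd_quiet(12, 18), 30)
gcd_many <- function(values) {
  Reduce(gcd_quiet, values)
}

print(gcd_many(c(12, 18, 30)))  # 6
```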

About the algorithm and language used in this code snippet:

Euclidean Greatest Common Divisor (GCD) Algorithm

The greatest common divisor of two numbers (in this case a and b) is the biggest number by which both numbers can be divided without a remainder. This greatest common divisor algorithm, called the Euclidean algorithm, determines this number. The greatest common divisor is also often abbreviated as gcd.

Description of the Algorithm

The basic principle behind this gcd algorithm is to recursively determine the gcd of a and b by determining the gcd of b and a % b (written a %% b in R). This hinges on the fact that the gcd of two numbers also divides their difference, so replacing the larger number by the difference leaves the gcd unchanged: e.g. the greatest common divisor of 16 and 24 (which is 8) is also the greatest common divisor of 16 and 24-16=8. The same is true for 16 and 40, where subtracting 16 twice again leaves 8. In fact, rather than taking the difference, the remainder can be used directly (repeatedly recursing on the difference will inevitably “pass” the remainder). In summary, the Euclidean gcd algorithm uses these 2 steps:

  1. If a or b is 0, return the other one.
  2. Repeat with the new a as b and the new b as a % b.

Using the remainder has a much faster runtime compared to using the difference.
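The two facts used above can be checked numerically. This is a minimal sketch, assuming a hypothetical helper gcd_quiet (the same recursion as the snippet at the top, without logging):

```r
# Hypothetical helper: Euclidean recursion without the print calls.
gcd_quiet <- function(a, b) {
  if (b == 0) {
    return(a)
  }
  gcd_quiet(b, a %% b)
}

# The gcd is unchanged when one number is replaced by the difference...
print(gcd_quiet(16, 24) == gcd_quiet(16, 24 - 16))   # TRUE
# ...or by the remainder, which skips the repeated subtractions.
print(gcd_quiet(16, 40) == gcd_quiet(16, 40 %% 16))  # TRUE
```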

Example of the Algorithm

When computing the gcd of 1071 and 462, the following steps will be taken:

  1. New a is 1071, new b is 462
  2. Recursing with new a = b and new b = a % b…
  3. New a is 462, new b is 147
  4. Recursing with new a = b and new b = a % b…
  5. New a is 147, new b is 21
  6. Recursing with new a = b and new b = a % b…
  7. New a is 21, new b is 0
  8. b is 0, stopping recursion, a is the gcd: 21
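The same trace can be collected as data instead of being printed. The following is only a sketch: gcd_trace is a hypothetical helper that uses a loop rather than recursion and records every (a, b) pair in a data frame.

```r
# Hypothetical helper: records each (a, b) pair instead of printing it.
gcd_trace <- function(a, b) {
  steps <- data.frame(a = a, b = b)
  while (b != 0) {
    remainder <- a %% b
    a <- b
    b <- remainder
    steps <- rbind(steps, data.frame(a = a, b = b))
  }
  steps
}

trace <- gcd_trace(1071, 462)
print(trace)
# Rows: (1071, 462), (462, 147), (147, 21), (21, 0)
```

The gcd is the value of a in the last row, where b has reached 0.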

Runtime Complexity of the Algorithm

The runtime complexity of the Euclidean greatest common divisor algorithm is O(log(max(a,b))) (the logarithm of the maximum of the two numbers). Using the remainder rather than the difference is considerably faster - if the difference were used instead, this greatest common divisor algorithm would have a worst-case runtime of O(max(a,b)).
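To make the gap concrete, the number of steps of both variants can be counted. This is a rough sketch with two hypothetical helpers, steps_remainder and steps_difference; the difference-based one assumes both inputs are positive.

```r
# Count loop iterations for the remainder-based variant.
steps_remainder <- function(a, b) {
  steps <- 0
  while (b != 0) {
    remainder <- a %% b
    a <- b
    b <- remainder
    steps <- steps + 1
  }
  steps
}

# Count loop iterations for the difference-based variant (assumes a, b > 0).
steps_difference <- function(a, b) {
  steps <- 0
  while (a != b) {
    if (a > b) {
      a <- a - b
    } else {
      b <- b - a
    }
    steps <- steps + 1
  }
  steps
}

print(steps_remainder(1000000, 1))   # 1
print(steps_difference(1000000, 1))  # 999999
```

The pair (1000000, 1) is a worst case for the difference-based variant, since only 1 is subtracted per step.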

Space Complexity of the Algorithm

The space complexity of the recursive implementation is equal to its runtime complexity, O(log(max(a,b))), since every recursive call is saved on the stack and everything else is constant.
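The stack growth comes from the recursion, not from the algorithm itself. As a sketch (gcd_iterative is a hypothetical name, not part of the snippet above), the same steps can be written as a loop that only keeps a fixed number of variables:

```r
# Iterative variant: same steps as the recursive gcd(), but no call stack growth.
gcd_iterative <- function(a, b) {
  while (b != 0) {
    remainder <- a %% b
    a <- b
    b <- remainder
  }
  a
}

print(gcd_iterative(1071, 462))  # 21
print(gcd_iterative(10, 20))     # 10
```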

R

R is an interpreted language first released in 1993 with a significant increase in popularity in recent years. It is primarily used for data mining and data science as well as statistics, and is a popular language in non-computer science disciplines ranging from Biology to Physics. R is dynamically typed, and has one of the widest varieties of libraries for statistics, machine learning, data mining etc.

Getting to “Hello World” in R

The most important things first - here’s how you can run your first line of code in R.

  1. Download and install the latest version of R from r-project.org. You can also download an earlier version if your use case requires it.
  2. Open a terminal, make sure the R command is working, and that the command you’re going to be using is referring to the version you just installed by running R --version. If you’re getting a “command not found” error (or similar), try restarting your command line, and, if that doesn’t help, your computer. If the issue persists, here are some helpful StackOverflow questions for Windows, Mac and Linux.
  3. As soon as that’s working, you can run the following snippet: print("Hello World"). You have two options to run this:
     3.1 Run R in the command line, just paste the code snippet and press enter (press CTRL + D and type n followed by enter to exit).
     3.2 Save the snippet to a file, name it something ending with .R, e.g. hello_world.R, and run Rscript hello_world.R.
     Tip: use the ls command (dir in Windows) to figure out which files are in the folder your command line is currently in.

That’s it! Notice how printing something to the console is just a single line in R - this low entry barrier and lack of required boilerplate code is a big part of the appeal of R.

Fundamentals in R

To understand algorithms and technologies implemented in R, one first needs to understand what basic programming concepts look like in this particular language.

Variables and Arithmetic

Variables in R are really simple: there is no need to declare a datatype or even declare that you’re defining a variable; R knows this implicitly. R is also able to easily define objects and their properties, in multiple different ways.

some_value = 10
my_object <- list(my_value = 4)
attr(my_object, 'other_value') <- 3

print((some_value + my_object$my_value + attr(my_object, 'other_value'))) # Prints 17
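As a small illustration of the implicit typing and of the basic arithmetic operators, here is a sketch using only base R (the variable name is reused from above purely for illustration):

```r
some_value <- 10
print(class(some_value))  # "numeric"

# The same name can later hold a value of a different type.
some_value <- "ten"
print(class(some_value))  # "character"

# Basic arithmetic operators
print(7 + 2)    # 9
print(7 / 2)    # 3.5
print(7 %/% 2)  # 3 (integer division)
print(7 %% 2)   # 1 (remainder)
print(7 ^ 2)    # 49
```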

Arrays

Working with arrays is similarly simple in R:

# Create 2 vectors of length 3
vector1 <- c(1,2,3)
vector2 <- c(4,5,6)

# Create names for rows and columns (optional)
column.names <- c("column_1","column_2","column_3")
row.names <- c("row_1","row_2")

# Concatenate the vectors and fill a 2x3 array with the values column by column, providing dimensions and row/column names
result <- array(c(vector1,vector2), dim = c(2,3), dimnames = list(row.names, column.names))

print(result)
# Prints:
#       column_1 column_2 column_3
# row_1        1        3        5
# row_2        2        4        6

As those of you familiar with other programming languages like Java might have already noticed, these are not arrays in the Java sense, but rather ordinary vectors with dimension (and optionally name) attributes attached. Element-by-element work on them in interpreted R code is typically considerably slower than array access in lower-level programming languages. This is a trade-off R makes in favor of simplicity. There are, however, packages which implement array-like data structures that are considerably faster.
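Individual elements, rows and columns of such an array can be accessed by position or by name. A short sketch, using a hypothetical 2x3 array built the same way as above:

```r
result <- array(1:6, dim = c(2, 3),
                dimnames = list(c("row_1", "row_2"),
                                c("column_1", "column_2", "column_3")))

print(dim(result))                  # 2 3
print(result[2, 3])                 # 6 (row 2, column 3; indices start at 1)
print(result["row_1", "column_2"])  # 3
print(result[1, ])                  # the whole first row: 1 3 5 (with column names)
```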

Conditions

Just like most programming languages, R can do if-else statements:

value <- 1
if (value == 1) {
  print("Value is 1")
} else if (value == 2) {
  print("Value is 2")
} else {
  print("Value is something else")
}

R can also do switch statements, although they are implemented as a function, unlike in other languages like Java:

x <- switch(
   1,
   "Value is 1",
   "Value is 2",
   "Value is 3"
)

print(x)

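Note that this particular example is pretty useless, since the value to switch on is a constant; switch becomes more useful when it is given a variable, and it can also select among named alternatives using a string. There are also other functions for more complex use cases.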
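A slightly more realistic use of switch selects among named alternatives by string. The sketch below uses a hypothetical function describe_unit; the final unnamed argument serves as the fallback when nothing matches.

```r
describe_unit <- function(unit) {
  switch(
    unit,
    "s" = "seconds",
    "m" = "minutes",
    "h" = "hours",
    "unknown unit"  # unnamed last argument acts as the default
  )
}

print(describe_unit("m"))  # "minutes"
print(describe_unit("x"))  # "unknown unit"
```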

Loops

R supports both for and while loops as well as break and next statements (comparable to continue in other languages). Additionally, R supports repeat-loops, which are comparable to while(true) loops in other languages, but simplify the code a little bit.

value <- 0
repeat {
  value <- value + 1
  if(value > 10) {
    break
  }
}
print(value)

value <- 0
while (value <= 10) {
  value <- value + 1
}
print(value)

value <- c("Hello","World","!")
for (i in value) {
  print(i)
}

for(i in 1:10){
  print(i)
}
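The break and next statements mentioned above can be combined in a single loop. A minimal sketch (the variable name odd_values is only for illustration):

```r
odd_values <- c()
for (i in 1:10) {
  if (i %% 2 == 0) {
    next   # skip even numbers and continue with the next iteration
  }
  if (i > 7) {
    break  # stop the loop entirely
  }
  odd_values <- c(odd_values, i)
}
print(odd_values)  # 1 3 5 7
```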

Functions

Functions in R are easily defined and, for better or worse, do not require specifying return or argument types. Optionally, a default for arguments can be specified:

my_func <- function (
  a = "World"
) {
  print(a)
  return("!")
}

my_func("Hello")
print(my_func())

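(Run in the R console or with Rscript, this prints “Hello” and “!” for the first call, since R automatically prints the unused return value at the top level, and then “World” and “!” for the second.)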
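Arguments can also be passed by name, and a function returns the value of its last expression even without an explicit return. A short sketch with a hypothetical function greet:

```r
greet <- function(name = "World", punctuation = "!") {
  # The value of the last evaluated expression is returned implicitly.
  paste0("Hello ", name, punctuation)
}

print(greet())                               # "Hello World!"
print(greet("R"))                            # "Hello R!"
print(greet(punctuation = "?"))              # "Hello World?"
print(greet(punctuation = ".", name = "R"))  # "Hello R."
```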

Syntax

R uses curly brackets ({}) to surround code blocks in conditions, loops, functions etc. (they may only be omitted when the body is a single expression). While this can lead to some annoying syntax errors, it also means the use of whitespace for preferred formatting (e.g. indentation of code pieces) does not affect the code.
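As an illustration, the two hypothetical functions below differ only in their formatting and behave identically:

```r
# Compact formatting
square_compact <- function(x) { x * x }

# Same body with different line breaks and indentation
square_spread <- function(x)
{
      x *
        x
}

print(square_compact(4))  # 16
print(square_spread(4))   # 16
```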

Advanced Knowledge of R

For more information, R has a great Wikipedia article. The official website is r-project.org.