Using R and C++ together


I recently developed the R package rivr for teaching open-channel hydraulics. Solving unsteady open-channel flow problems is often computationally expensive, and an interpreted language like R is usually not the right choice for this sort of program. However, the Rcpp package makes it easy to embed C++ in your package, so you can get the best of both worlds by linking compiled code into your R package. There are some pretty great references on how and when to use C++ in R, but I think the best way to include it in packages is via Rcpp attributes, especially since they play nicely with devtools and roxygen2.

Programming with the C++ library provided by Rcpp looks a lot like programming in R. Consider the following function, first implemented in R and then equivalently in C++:

# this is a silly function written in R

my_r_func = function(a, b){
  res = a
  if(a < b){
    for(i in a:b){
      res = res*b
    }
  } else {
    for(i in b:a){
      res = res*a
    }
  }
  return(res)
}
// this is a silly function written in C++

int my_cpp_func(int a, int b){
  int res = a;
  if(a < b){
    for(int i = a; i <= b; i++){   // <= so the loop covers a through b, like a:b in R
      res *= b;
    }
  } else {
    for(int i = b; i <= a; i++){   // <= so the loop covers b through a, like b:a in R
      res *= a;
    }
  }
  return(res);
}

They look basically the same! The most obvious differences are that (1) C++ functions and variables need to have their types explicitly declared, (2) loops are written slightly differently, (3) C++ statements need to end with a semicolon, (4) you can use shortcuts like a *= b instead of writing a = a*b, and (5) C++ comments use // instead of #. Otherwise a lot of things are the same: if and while statements have similar syntax, you use curly braces {} to group blocks of logic, and you use return and break statements to exit functions and loops.

There are a few other important differences that can trip you up though, and some of them are subtle. I’ve made a short list of things for R programmers to keep in mind when writing C++ code with Rcpp.

Vector and matrix indices start at 0, rather than 1.

Consider the following function, first written in R and then in C++.

# selecting an element in R

r_select_element = function(s, i){
  return(s[i])
}
// selecting an element in C++

double cpp_select_element(NumericVector s, int i){
  return(s[i]);
}
# after loading C++ function in R

myvector = seq(5)
r_select_element(myvector, 1)    # will return first element

cpp_select_element(myvector, 0)  # will return first element

cpp_select_element(myvector, 1)  # will return second element


r_select_element(myvector, 5)    # will return last element

cpp_select_element(myvector, 4)  # will return last element

cpp_select_element(myvector, 5)  # out of bounds: [] is not range-checked, so do not count on an error
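
To make the index shift concrete outside of Rcpp, here is a minimal sketch in plain C++ that uses std::vector as a stand-in for NumericVector. The helper names checked_select and select_r_style are hypothetical. The sketch relies on the standard .at() accessor, which throws std::out_of_range for a bad index, whereas plain [] on a std::vector performs no check at all.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: zero-based, bounds-checked element access.
// std::vector stands in for Rcpp's NumericVector here.
double checked_select(const std::vector<double>& s, std::size_t i) {
  return s.at(i);  // throws std::out_of_range if i >= s.size()
}

// Hypothetical helper: translate an R-style (one-based) index
// into a C++ (zero-based) index before looking up the element.
double select_r_style(const std::vector<double>& s, std::size_t r_index) {
  if (r_index == 0) {
    throw std::out_of_range("R-style indices start at 1");
  }
  return checked_select(s, r_index - 1);
}
```

With a five-element vector, checked_select(v, 4) and select_r_style(v, 5) both return the last element, while checked_select(v, 5) throws instead of silently reading past the end.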

Use parentheses to refer to matrix elements.

Indexing a vector in C++ uses square brackets [] (like R), but for matrices you must use parentheses (). To select an entire row or column of a matrix, use the _ placeholder in C++ where R would leave the index blank.

# indexing in R

A = matrix(rep(0, 12), nrow = 3) # create 3x4 matrix filled with zeros

x = A[1,1]                       # select element in first row and column

y = A[1,]                        # select first row

z = y[1]                         # select first element
// indexing in C++

NumericMatrix A(3, 4);           // create 3x4 matrix (filled with zeros by default)

double x = A(0, 0);              // select element in first row and column

// double x = A[0, 0];           // wrong: [] takes a single index, so this is not a row/column lookup

NumericVector y = A(0, _);       // select first row

double z = y[0];                 // select first element

double z2 = y(0);                // also selects the first element: Rcpp vectors accept () too
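
One way to see why matrices use parentheses: before C++23, operator[] could only take a single argument, while operator() can take any number. The sketch below is a hypothetical SimpleMatrix class in plain C++, not Rcpp's actual implementation. It assumes column-major storage, which is how R lays out matrix data, and its row() method is only a rough analogue of A(i, _).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical column-major matrix, mimicking how R lays out matrix data.
class SimpleMatrix {
public:
  SimpleMatrix(std::size_t nrow, std::size_t ncol)
    : nrow_(nrow), ncol_(ncol), data_(nrow * ncol, 0.0) {}

  // operator() accepts two zero-based indices: row i and column j.
  double& operator()(std::size_t i, std::size_t j) {
    return data_[i + j * nrow_];
  }

  // Copy out row i, loosely analogous to A(i, _) in Rcpp.
  std::vector<double> row(std::size_t i) const {
    std::vector<double> out(ncol_);
    for (std::size_t j = 0; j < ncol_; ++j) {
      out[j] = data_[i + j * nrow_];
    }
    return out;
  }

private:
  std::size_t nrow_;
  std::size_t ncol_;
  std::vector<double> data_;
};
```

For example, after SimpleMatrix A(3, 4); the expression A(0, 0) refers to the element in the first row and first column, and A.row(0) returns a four-element copy of the first row.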

Double quotes "" denote a string, but single quotes '' denote a char.

This one took me an embarrassingly long time to figure out. You can’t use comparison operators such as == to directly compare a string to a char, so you need to be consistent on both sides of the comparison.

# string comparison in R

mystring = "foobar"
substr(mystring, 5, 5) == "a"    # will return TRUE

substr(mystring, 5, 5) == 'a'    # same as above
// string comparison in C++

std::string mystring = "foobar";
mystring.at(4) == 'a';           // will return true

mystring.at(4) == "a";           // will fail to compile: a char compared with a string literal
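
As a small illustration of keeping both sides of the comparison consistent, the sketch below shows the two options in plain C++: compare a char with a char, or compare a std::string with a std::string. The function names char_matches and substring_matches are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Compare one character of s against a char (char on both sides).
bool char_matches(const std::string& s, std::size_t pos, char c) {
  return s.at(pos) == c;
}

// Compare a substring of s against another string (string on both sides).
bool substring_matches(const std::string& s, std::size_t pos,
                       const std::string& target) {
  return s.substr(pos, target.size()) == target;
}
```

With std::string mystring = "foobar"; both char_matches(mystring, 4, 'a') and substring_matches(mystring, 4, "a") are true, because each compares like with like.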

You can’t store a function in a variable in C++, but you can use pointers to pass functions.

Pointers let you refer to specific locations in the program's memory, which means you can induce side effects such as changing the underlying value of a variable, or you can reference a function. You can do a lot with pointers, but they can also get you into trouble.

# functions as variables in R

my_add = function(a, b){                 # create function

  return(a + b)
}
my_r_func = my_add                       # store function in variable

my_r_func(1, 2)                          # call function
// using pointers to reference functions in C++

double cpp_my_add(double a, double b){   // create function

  return(a + b);
}
double (*cpp_myfunc) (double, double);   // create pointer by prepending with *

cpp_myfunc = &cpp_my_add;                // point to existing function using &

cpp_myfunc(1, 2);                        // call function
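
The listing above stores a function pointer in a variable. Passing one to another function works the same way. Here is a minimal sketch in plain C++, where add_two, multiply_two, and apply_binary are hypothetical names chosen for illustration.

```cpp
// Two plain functions with the same signature.
double add_two(double a, double b) { return a + b; }
double multiply_two(double a, double b) { return a * b; }

// Hypothetical helper: receives a function through a pointer and calls it.
double apply_binary(double (*f)(double, double), double a, double b) {
  return f(a, b);
}
```

Calling apply_binary(&add_two, 1.0, 2.0) runs add_two on the two numbers. The & is optional when passing a function, so apply_binary(multiply_two, 3.0, 4.0) is equally valid.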

Pointing to variables has slightly different syntax. I can’t really think of an equivalent operation in R!

NumericVector myvec(10, 0);              // initialize a NumericVector of zeros

NumericVector *pointvec;                 // create pointer by prepending with *

pointvec = &myvec;                       // point to variable using &

(*pointvec)[0] = 5;                      // assign to first element via pointer; parentheses needed because [] binds tighter than *


// you can also point to individual elements

double *pointelement;
pointelement = &myvec[0];
*pointelement = 5;                       // same as above 
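
The same two patterns can be tried without Rcpp. The sketch below uses std::vector as a stand-in for NumericVector, with the hypothetical helpers set_first and set_through_pointer. Both change the caller's data as a side effect, which is the point of passing a pointer.

```cpp
#include <vector>

// Modify the caller's vector through a pointer to the whole vector.
void set_first(std::vector<double>* v, double value) {
  (*v)[0] = value;   // dereference first, then index
}

// Modify a single element through a pointer to that element.
void set_through_pointer(double* element, double value) {
  *element = value;
}
```

For example, given std::vector<double> myvec(10, 0.0); the call set_first(&myvec, 5.0) changes myvec[0] in place, and set_through_pointer(&myvec[1], 7.0) changes only the second element.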

Those were the main pitfalls I ran into as an R programmer learning C++. Hopefully these tips and references will let you hit the ground running.

