CRAN reports UB in your R-package: How to fix it?

CRAN automatically checks packages with C/C++ source code for Undefined Behaviour (UB). If the automated checks find UB it might result in the removal of the package from CRAN. This is a good thing: the presence of UB is a potential source of bugs and has no positive side effects. These checks have been improving, and they found that the scrypt package actually had some UB which needed to be fixed. This scared me a bit because, although I'm the maintainer, the package contains some relatively complicated code which I didn't write or change. A bit of history: Colin Percival conceived and implemented the algorithm in C and Andrew Kipp ported his code to R. I only revived the package after improved automated checks found problems that needed to be solved.

Reproducing the issue was my first challenge. My machine runs Ubuntu LTS and doesn't have all the fancy tools installed to detect the UB. If I'm not able to reproduce the issue I can't know whether I fixed it either, so reproducing seemed like a good first step. I asked on Twitter and Dirk Eddelbuettel pointed me to the repo of a Docker image by Winston Chang, purpose-built for exactly this kind of analysis. The instructions are worth reading, but this is all I needed:

docker run --rm -ti --security-opt seccomp=unconfined -v $(pwd):/rscrypt wch1/r-debug

This downloads the Docker image if you don't have it yet and starts it with the current directory mounted at /rscrypt inside the container. Within the container you can access versions of R built with extra instrumentation. For detecting UB two versions are available: RDsan ('san' is short for sanitizer), which is compiled with gcc, and RDcsan, which is compiled with clang. So now I could do (some output elided):

$> RDcsan CMD INSTALL scrypt_0.1.3.tar.gz  # The last version with the problem
$> RDcsan
R> scrypt::hashPassword("password")
scrypt-1.1.6/lib/crypto/sha256.c:254:24: runtime error: null pointer passed as argument 2, which is declared to never be null
/usr/include/string.h:44:28: note: nonnull attribute specified here
    #0 0x7f21fe3bde7f in scrypt_SHA256_Update /tmp/.../sha256.c:254:3
    #1 0x7f21fe42b1e6 in scrypt_HMAC_SHA256_Update /tmp/.../sha256.c:335:2
    #2 0x7f21fe42b6a1 in PBKDF2_SHA256 /tmp/../sha256.c:377:2
    #3 0x7f21fe42c9e6 in crypto_scrypt /tmp/.../crypto_scrypt-ref.c:258:2
    #4 0x7f21fe46f822 in getcpuperf(double*) /tmp/.../util.cpp:145:13
    #5 0x7f21fe46420c in (anonymous namespace)::getparams(double, double, int*, unsigned int*, unsigned int*) /tmp/.../scrypt.cpp:49:15
    #6 0x7f21fe462f81 in hashPassword(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, double, double) /tmp/.../scrypt.cpp:129:15
    #7 0x7f21fe439c72 in _scrypt_hashPassword /tmp/.../RcppExports.cpp:17:34
...
etc

The compiled code of the scrypt package is entered at frame #7.

The CRAN error report shows the UB with R compiled using both gcc and clang, but for some reason when I tried it the issue was only flagged by RDcsan. The important thing is that the error can be reproduced, so a possible fix can be validated.

In this particular case the error was easy to find. The code that triggers the UB is the following memcpy() call, and from the error message we know that the src argument (argument 2) is the culprit:

memcpy(&ctx->buf[r], src, len);

Tracing through the call stack I found that src in this case is the salt used when figuring out how many iterations scrypt should perform to run reasonably fast on the current machine. More importantly, for this performance test the salt has the value NULL. This explains the error found: memcpy() can't copy anything from NULL.

With the root cause identified, the solution is simple: set the initial salt to an actual value. Since this code is only called when checking the CPU performance, and since the old behaviour was UB anyway, it doesn't really make a difference which value is used; I chose all zeroes. After that the UB disappeared, a commit was pushed and a submission to CRAN was accepted.
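To illustrate the idea, here is a minimal sketch in C. It is not the actual commit: probe_cpu_performance() and its parameter values are made up for this post, only crypto_scrypt() and its signature come from the scrypt sources. The point is that the salt handed to crypto_scrypt(), and eventually to memcpy(), is a zero-filled buffer rather than a NULL pointer.

#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Signature from the scrypt reference implementation (crypto_scrypt-ref.c). */
int crypto_scrypt(const uint8_t *passwd, size_t passwdlen,
                  const uint8_t *salt, size_t saltlen,
                  uint64_t N, uint32_t r, uint32_t p,
                  uint8_t *buf, size_t buflen);

/* Hypothetical performance probe, for illustration only; the real code in
   the package's util.cpp looks different. */
static int probe_cpu_performance(void)
{
    const uint8_t password[] = "dummy password";
    uint8_t salt[32];
    uint8_t derived[64];

    /* Give the salt a real value (all zeroes) so that no NULL pointer ever
       reaches memcpy() inside sha256.c. */
    memset(salt, 0, sizeof(salt));

    return crypto_scrypt(password, sizeof(password) - 1,
                         salt, sizeof(salt),
                         1024, 8, 1,        /* small N, r, p for a quick probe */
                         derived, sizeof(derived));
}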

Lessons learned

  1. CRAN has been checking packages for UB since 2014 and the checks are still improving, great!
  2. Finding UB like this requires help from good tooling, which isn't as readily available as R and CRAN themselves
  3. Luckily, smart people have figured out a way to get this tooling on your computer in a few minutes
  4. Once I knew where to look, the problem was easily solved

FastR sudokus

A few days ago @JozefHajnala pointed me to Oracle's FastR, an implementation of R on the Java Virtual Machine (GraalVM). This implementation aims to be completely compatible with the GNU R implementation we all know and love.

In September 2018 I gave a talk (GitHub, presentation) on using Rcpp for solving sudokus at Amsterdam SatRday. The point of the talk was to show the performance benefits that can be achieved using the Rcpp package. This post describes my adventure of solving sudokus using FastR and compares the performance of the Rcpp solver with that of the plain R sudoku solver run on FastR.

Installation of FastR

The easiest way to get started is to obtain the GraalVM community edition from the GitHub releases page. Download the graalvm-ce tar.gz relevant for your platform, untar it, and use the gu tool to install and start R:

tar xzvf graalvm-ce-*-19.0.0.tar.gz
cd graalvm-ce-19.0.0/bin
./gu install R
./R

If all went well, you will now see a somewhat familiar sight:

R version 3.5.1 (FastR)
Copyright (c) 2013-19, Oracle and/or its affiliates
Copyright (c) 1995-2018, The R Core Team
Copyright (c) 2018 The R Foundation for Statistical Computing
Copyright (c) 2012-4 Purdue University
Copyright (c) 1997-2002, Makoto Matsumoto and Takuji Nishimura
All rights reserved.

FastR is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information.

Type 'q()' to quit R.

A quick test drive

Installing packages is as easy as with GNU R, so doing the first test doesn't take long:

install.packages('microbenchmark')
sqr <- function(x) x * x
microbenchmark::microbenchmark(sqr(runif(1000)))
Unit: microseconds
            expr min  lq mean median  uq max neval
sqr(runif(1000)) 145 154  179    165 178 653   100 

To be fair, this is not very promising. Running the same snippet in GNU R 3.4.4 gives much better performance:

sqr <- function(x) x * x
microbenchmark::microbenchmark(sqr(runif(1000)))
Unit: microseconds
             expr min lq mean median uq   max neval
 sqr(runif(1000))  24 25  152     25 25 11919   100

Hmm, it appears to me they should have called it SlowR instead. But before we dish out our final judgement, let's tweak the benchmark a bit (sqr.R in this gist):

library(microbenchmark)
sqr <- function(x) x * x
f1 <- function() for (n in 1:1000) sqr(runif(1:n))
print(R.version)
print(microbenchmark(f1(), control = list(warmup = 100L)))

Running this snippet in both R implementations motivated me to continue my research and actually made me excited to see what FastR has to offer:

> source('~/code/FastR/sqr.R')
...
language       R
version.string R version 3.4.4 (2018-03-15)
Unit: milliseconds
 expr     min       lq     mean  median       uq      max neval
 f1() 16.3209 17.15926 18.16575 17.4746 17.73782 51.27648   100

> source('~/code/FastR/sqr.R')
...
language       R
engine         FastR
version.string FastR version 3.5.1 (2018-07-02)
Unit: milliseconds
 expr      min       lq     mean   median      uq      max neval
 f1() 6.053445 6.374638 8.226994 6.765636 6.99983 30.11883   100

Solving sudokus

The sudoku below has been claimed to be the world’s hardest sudoku:

sudokuTxt <- "
8 0 0 0 0 0 0 0 0
0 0 3 6 0 0 0 0 0
0 7 0 0 9 0 2 0 0
0 5 0 0 0 7 0 0 0
0 0 0 0 4 5 7 0 0
0 0 0 1 0 0 0 3 0
0 0 1 0 0 0 0 6 8
0 0 8 5 0 0 0 1 0
0 9 0 0 0 0 4 0 0"
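The solver code itself lives in the gist; as a rough sketch of how a string like this can be turned into the 9x9 integer matrix the solver operates on, something like the following works (parseSudoku is my own illustration, not a function from the gist):

# Sketch only: convert the whitespace-separated text above into a 9x9
# integer matrix, with 0 marking an empty cell.
parseSudoku <- function(txt) {
  values <- as.integer(strsplit(trimws(txt), "\\s+")[[1]])
  stopifnot(length(values) == 81)
  matrix(values, nrow = 9, ncol = 9, byrow = TRUE)
}

sudoku <- parseSudoku(sudokuTxt)
sudoku[1, 1]  # 8, the only given in the first row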

On my machine, the R/Rcpp solver finds a solution within a second. The source can be found in the gist as sudoku.cpp and sudoku.R. Microbenchmarking this solution in GNU R gives:

R> source('sudoku.R')
R> microbenchmark(cpp = solve2(sudoku, findChoicesCpp))
Unit: milliseconds
 expr      min       lq     mean   median       uq      max neval
  cpp 721.5412 735.6565 756.2961 748.2816 773.9773 835.9321   100

while the plain R solver run on the FastR engine is roughly 2.8 times as fast (comparing the medians):

$> ./R --vm.Xss5000k
R> source('plain_R.R')
R> microbenchmark::microbenchmark(
     solve2(sudoku, findChoices2), control = list(warmup = 20L)
   )
Unit: milliseconds
                         expr min  lq mean median  uq  max neval
 solve2(sudoku, findChoices2) 257 260  293    266 271 1217   100

Caveats

Note the command line parameter in the example above that increases the stack size. If the stack size is not increased like that, the solver function will crash.

Another important thing to note is that FastR uses more cores and needs more memory than GNU R.

Conclusion

Oracle FastR is an interesting project that can improve the runtime of R calculations by quite a bit, and I will be watching its development closely. Resource use can be a problem though: after running the benchmark, the FastR process used almost 4 GB of RAM against less than 500 MB for the process running the Rcpp code.