High precision arithmetic

High precision arithmetic is available in package Rmpfr (see also Rmpfr at R-Forge), which provides an interface to MPFR Library and GMP Library. An external high precision decimal calculator, which may be accessed from R, is Berkeley calculator (bc command on Unix systems). High precision and rational arithmetic is also available using Ryacas, which provides and interface to a runnig Yacas version 1.0.63. An example of the use of Ryacas is described below and more information may be found in computer algebra systems and R.

Package Rmpfr requires that the libraries MPFR and GMP are installed on the system. However, on the contrary to the interfaces to Yacas and BC below, when using Rmpfr, the computation is done within the process running R and no other processes are needed.

Rmpfr package

The determinant of the matrix

a <- rbind(
c(703513661, 734349672),
c(637552115, 665496937))

is 77, however, in double precision, we obtain

det(a) # [1] 83.86536
a[1, 1] * a[2, 2] - a[2, 1] * a[1, 2] # [1] 64

The exact value may be obtained using Rmpfr

b <- array(mpfr(a, precBits=100), dim=dim(a))
b[1, 1] * b[2, 2] - b[2, 1] * b[1, 2]
# 1 'mpfr' number of precision  100   bits
# [1] 77

Another example of the use of the package is as follows

#  ...
x <- mpfr("6e-10", precBits=200) # approximately 60 decimal digits
log(1 + x)
#  1 'mpfr' number of precision  200   bits 
#  [1] 5.9999999982000000007199999996760000001555199999222418595962767e-10
x - x^2/2 + x^3/3 - x^4/4 + x^5/5
#  1 'mpfr' number of precision  200   bits 
#  [1] 5.9999999982000000007199999996760000001555200000000000000000022e-10
x - x^2/2 + x^3/3
#  1 'mpfr' number of precision  200   bits 
#  [1] 5.9999999982000000007200000000000000000000000000000000000000046e-10

Note that the accuracy of Rmpfr numbers allows to compare the accuracy of the approximation of log(1 + 6e-10) using three and five terms of the Taylor expansion.

An example demonstrating matrix computation with Rmpfr numbers follows. The function vandermonde() creates a Vandermonde matrix for the given vector of values. The function inv.vandermonde() computes the inverse of the Vandermonde matrix using the fact that the columns of the inverse are formed by the coefficients of the Lagrange polynomials for the given points. The product of these two matrices is then compared to the indentity matrix and the maximum absolute difference over the elements of the matrix is printed.

The functions vandermonde() and inv.vandermonde() are written so that they may be used either with double precision or with Rmpfr numbers.

vandermonde <- function(a)
    n <- length(a)
    A <- array(a, dim=c(n, n))
    A^(col(A) - 1)
inv.vandermonde <- function(a)
    n <- length(a)
    B <- array(a - a, dim=c(n, n)) # zeros with the same accuracy
    B[1, ] <- 1
    num <- B[1, ]
    for (i in 1:n)
        C <-  - a[i] * B
        C[2:n, ] <- C[2:n, ] + B[1:(n-1), ]
        B[, - i] <- C[, - i]
        p <- a - a[i]
        p[i] <- 1
        num <- num * p
    B / array(rep(num, each=n), dim=c(n, n))
n <- 22
a <- 0:(n-1)
b <- mpfr(a, precBits=100)
err.double.solve <- try( vandermonde(a) %*% solve(vandermonde(a)) - diag(n) )
err.double <- vandermonde(a) %*% inv.vandermonde(a) - diag(n)
err.mpfr   <- vandermonde(b) %*% inv.vandermonde(b) - diag(n)
if (is.numeric(err.double.solve))
    err.double.solve <- max(abs(err.double.solve))
} else
    err.double.solve <- "system is computationally singular"
cat("1. Error using solve() in double precision:", err.double.solve, "\n\n")
cat("2. Error using inv.vandermonde() in double precision:", max(abs(err.double)), "\n\n")
cat("3. Error using inv.vandermonde() with Rmpfr numbers:\n\n")

The output of this script is

Error in solve.default(vandermonde(a)) : 
  system is computationally singular: reciprocal condition number = 3.81973e-34
1. Error using solve() in double precision: system is computationally singular 
2. Error using inv.vandermonde() in double precision: 8.1875 
3. Error using inv.vandermonde() with Rmpfr numbers:
1 'mpfr' number of precision  100   bits 
[1] 5.4123372450476381345652043819427e-14

The output demonstrates that the computation using double precision produces high error. Function solve() determines this situation and reports the system to be numerically singular. The function inv.vandermonde() does not detect this situation. However, if we use Rmpfr numbers, the result achieves a good accuracy.

Ryacas package

For a correct use of Ryacas, in particular on Unix platforms, it is necessary to install it according to the instructions on the Ryacas home page http://Ryacas.googlecode.com.

In the following example of using Ryacas, Sym() converts its argument to “Sym” class so that from that point onward the system interprets it as an expression to be evaluated by Yacas rather than R. Note that we use R syntax such as ^ even though Yacas itself uses ** for the same operation. The Sym interface is one of 8 interfaces between R and Yacas provided by the Ryacas package. The return value is of class “yacas” and the print.yacas method displays it as an expression here (although other forms are possible). The function N(expression, precision) forces the expression to be evaluated with the given precision.

library(Ryacas) # Loading required package: XML
N(log(1 + 4/Sym(10)^30), 100) # expression(4e-30)
Sym("100000000000000000000000001/100000000000000000000000000") - 1 # expression(1/1e+26)

Berkeley calculator

Berkeley calculator is an arbitrary precision numeric processing language that normally comes with UNIX and is also freely available for many other platforms (see, e.g., GNU bc). For Windows, get bc from http://sourceforge.net/projects/unxutils or http://gnuwin32.sourceforge.net/packages/bc.htm and then place bc.exe (and in the second case also readline4.dll) on your PATH.

Here is an example of how bc can be accessed from within R to perform computations with accuracy that exceeds the representational capacity of R’s numeric types.

 > source("http://r-bc.googlecode.com/svn/trunk/R/bc.R") # R/bc interface
 > zero <- bc(0)   # makes zero of class "bc"
 > # 1. any number that's combined with a "bc" becomes a "bc"
 > (zero + 1.1) ^ 100
[1] "13780.6123398222701841183371720896367762643312000384664331464775521549852095523076769401159497458526446001"
 > result <- .Last.value
 > # convert that to R
 > print(as.numeric(result), digits = 22)
[1] 13780.61233982227
 > # compare to the computation done in R
 >     print(1.1^100, digits=22)
[1] 13780.61233982238
 > # 2. another example of bc
 > result + (zero + 1.05) ^ 50
[1] "13792.0797396080238601531885856408478278561280565279682973044876640465221180539286052313473435599395586626"
 > # compare with the same computation done in R
 >     print(1.1^100 + 1.05^50, digits=22)
[1] 13792.07973960813
 > # 3. a third example
 > log(1+exp(zero-40))
[1] ".0000000000000000042483542552915889863049778436315821818777506798229745422054513754587839239746028813"
 > # compare to the same computation done in R
 > log(1+exp(-40))
[1] 0

The Berkeley calculator uses decimal arithmetic.

The following examples demonstrate the difference between the decimal arithmetic in bc and the machine binary arithmetic even in situations, when a large precision is not used.

The following example is adapted from The R Inferno by Patrick Burns, p. 2:

    bc('0.1 == 0.3/3', logical=TRUE)
    # TRUE
    0.1 == 0.3/3
    # FALSE

The following example is adapted from R FAQ 7.31 Why doesn't R think these numbers are equal?:

    bc('10*0.1 == 1', logical=TRUE)
    # TRUE
    10*0.1 == 1
    # FALSE

Comparison of Ryacas and the above interface to Berkeley calculator

The interface to Berkeley calculator starts bc for evaluating each expression. On the other hand, Ryacas communicates with a running Yacas. The main advantages of the Ryacas approach are:

  • it is possible to leave a value of the result of a calculation in yacas and then on subsequent calls to yacas continue to compute with it in full precision since state is maintained between calls.
  • yacas includes a parser which parses the OpenMath XML expressions emitted from yacas converting them to R so that the result can be an R expression whereas bc returns a native bc expression which may or may not by coincidence have the same syntax as R. Ryacas also has an R to yacas parser.
  • in principle Yacas could be running on a different faster machine. The communications between R and yacas is via TCP/IP and TCP/IP works across machines (that is what the internet uses) but even though the infrastructure exists currently Ryacas is set up to run yacas on the same machine so this advantage is currently theoretical.
  • yacas can compute exact results maintaining fractions as fractions whereas bc only computes high precision results

Section Speed comaprison presents a simple and not necessarily representative example, where bc was actually faster when running on the same machine as R.

misc/r_accuracy/high_precision_arithmetic.txt · Last modified: 2011/01/19 by psavicky
Recent changes RSS feed R Wiki powered by Driven by DokuWiki and optimized for Firefox Creative Commons License