The AMORE package: A MORE flexible neural network package

Overview

The AMORE package was born to provide the user with an unusual kind of neural network simulator: a highly flexible environment that gives direct access to the network parameters, offers fine control over the learning details, and lets the user customize the available functions to suit their needs. The current version of the package (0.2-9) can train a multilayer feedforward network with both the adaptive and the batch versions of the gradient descent with momentum backpropagation algorithm. Thanks to the structure adopted, adding a new error criterion is no harder than programming the corresponding R cost function. Everyone is invited to extend the set of available functions and to share their experiences.

Description of the package

End users who merely want to apply common neural networks already have plenty of good simulators to choose from. Among them, those familiar with R may wish to train their nets without leaving the R environment; the AMORE package can serve that purpose, although these users may find faster alternatives elsewhere. For researchers interested in neural networks and robust modeling, however, and in particular for those who want fine control over the inner workings of the training algorithms, the package offers a convenient platform for programming their own algorithms with relative ease.

Users may choose among three degrees of involvement with the package: using the functions already available, programming their own functions in the comfortable R arena during the first steps and trials, and moving to the C programming language to speed up the heavier training tasks.

We hope that a careful reading of the functions will provide a deeper understanding of the training methods already programmed, as well as inspiration on how to expand the current possibilities.

To ease that reading, the following sections describe the basic objects and their corresponding functions, using the network shown in the figure below as a running example: a multilayer feedforward network with two hidden layers of four and two neurons, which takes two input variables and produces a two-dimensional output. :packages:cran:redneuronal.png

The AMORE artificial neural network standard.

We have modeled the ANN using lists wherever a complex structure was needed. The preferred method to create the network is by using the newff function. For our example we would use something like:

 > net.start <- newff(n.neurons=c(2,4,2,2),learning.rate.global=1e-2, 
                     momentum.global=0.5, error.criterium="LMS", Stao=NA, 
                     hidden.layer="tansig", output.layer="purelin", method="ADAPTgdwm")

The resulting net contains the following elements, in the following order:

layers

A list with as many numerical vectors as there are layers. Each vector contains the indexes of the neurons belonging to that layer. In our example we have specified the value c(2,4,2,2) for the n.neurons parameter: that is to say, we want our net to have two input neurons, a first hidden layer with four neurons, a second hidden layer with two neurons and, finally, an output layer with two output neurons. The layers element of net.start should have the following contents:

 > net.start$layers
[[1]]
[1] -1 -2
 
[[2]]
[1] 1 2 3 4
 
[[3]]
[1] 5 6
 
[[4]]
[1] 7 8

Thus, we have four layers: the first containing neurons -1 and -2, the second with neurons 1, 2, 3 and 4, the third with neurons 5 and 6, and the last with neurons 7 and 8. The reader may notice that the indexes of the two input neurons are negative. The input neurons are merely virtual: conceptual supports used to access the different components of the input vector. The absolute value of an input neuron's index indicates which component of the input vector it refers to.

neurons

A list containing the neurons, which are themselves lists and deserve a deeper explanation; we provide it in the following section.

input

A numerical vector that contains the values of the input variables for a single case from the data set under consideration. It contains the input signals to be propagated through the network. In our example, it would contain the value of Input 1 and Input 2.

output

A numerical vector that contains the resulting predictions once the input signals have been propagated.

target

A numerical vector that contains the expected target values towards which the network’s output must approach.

deltaE

The cost function used to measure the errors between the outputs and the targets. It is written in R code so as to ease its modification according to the user's needs. Currently we provide cost functions for the Least Mean Squares, Least Mean Log Squares and TAO robust criteria. In our example we specified the option error.criterium="LMS", so the Least Mean Squares cost function is used:

 > net.start$deltaE
function (arguments)
{
    prediction <- arguments[[1]]
    target <- arguments[[2]]
    residual <- prediction - target
    return(residual)
}

As the reader may see, it is remarkably easy to modify this function so as to use a different criterion.
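As an illustration, the following sketch shows what a user-defined cost derivative in the spirit of the Least Mean Log Squares criterion could look like. It follows the same calling convention as the LMS function above, but the exact expression used by the package's own LMLS criterion may differ, and the names deltaE.custom and net.custom are purely illustrative.

deltaE.custom <- function (arguments)
{
    prediction <- arguments[[1]]
    target <- arguments[[2]]
    residual <- prediction - target
    ## Derivative of log(1 + residual^2/2) with respect to the prediction.
    return(residual / (1 + residual^2 / 2))
}

## Assign the new cost function to a copy of the network.
net.custom <- net.start
net.custom$deltaE <- deltaE.custom

Note that the train function also receives an error.criterium argument (see the training example later on), so hooking a fully custom criterion into the training process may require adapting that mechanism as well.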

other.elements

This element is a list of auxiliary elements. Currently it only holds the Stao parameter used by the TAO robust cost function.

The neuron: the basic entity.

We have chosen to represent an artificial neuron as a list whose elements contain the neuron weights, bias term, activation function (written in R code), and the properties needed to propagate the signal from the neuron's inputs to its output. The newff function calls the init.neuron function in order to properly create the network neurons. It allows the user to choose whether the neuron is an output neuron or a hidden one, and whether its activation function is linear (purelin), hyperbolic tangent (tansig) or sigmoidal (sigmoid); it can even be defined as custom, in which case the user programs the R code of the activation function and its derivative. The neuron contains the following elements, in the following order:

 > names(net.start$neurons[[1]])
 [1] "id"                   "type"
 [3] "activation.function"  "output.links"
 [5] "output.aims"          "input.links"
 [7] "weights"              "bias"
 [9] "v0"                   "v1"
[11] "f0"                   "f1"
[13] "method"               "method.dep.variables"

We will spend a few lines describing the meaning of each element by means of the neurons of net.start.

id

The index of the neuron. It is redundant by design, since it should be numerically equal to the R index of the element in the neurons list. Not surprisingly, for the neurons considered:

 > net.start$neurons[[1]]$id
[1] 1
 > net.start$neurons[[5]]$id
[1] 5
 > net.start$neurons[[8]]$id
[1] 8
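
Since the id should always coincide with the neuron's position in the neurons list, a quick sanity check over the whole net.start object (an illustrative one-liner, not part of the package) is:

## Should print 1 2 3 4 5 6 7 8 for our example network.
sapply(net.start$neurons, function(neuron) neuron$id)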

type

Information about the kind of layer this neuron belongs to: it can be either hidden or output. We remind the reader that the input neurons do not “really” exist.

 > net.start$neurons[[1]]$type
[1] "hidden"
 > net.start$neurons[[5]]$type
[1] "hidden"
 > net.start$neurons[[8]]$type
[1] "output"

activation.function

The name of the activation function that characterizes the neuron. Available functions are tansig, purelin, sigmoid and hardlim. The custom case allows users willing to make the needed changes to the f0 and f1 functions to use activation functions of their own.

 > net.start$neurons[[1]]$activation.function
[1] "tansig"
 > net.start$neurons[[5]]$activation.function
[1] "tansig"
 > net.start$neurons[[8]]$activation.function
[1] "purelin"

output.links

The indexes of the neurons towards which the output of this neuron will be propagated, that is to say, the indexes of the neurons that use this neuron’s output as one of their inputs.

 > net.start$neurons[[1]]$output.links
[1] 5 6
 > net.start$neurons[[5]]$output.links
[1] 7 8
 > net.start$neurons[[8]]$output.links
[1] NA

Most frequently, output neurons do not point to any other neuron, and an NA value reflects this.

output.aims

A neuron may use the outputs of several other neurons as inputs, and each input has to be weighted. This requires the inputs to be ordered, and this element accounts for that order: it gives, for each neuron listed in output.links, the position that this neuron's output occupies among that neuron's inputs.

 > net.start$neurons[[1]]$output.aims
[1] 1 1
 > net.start$neurons[[3]]$output.aims
[1] 3 3
 > net.start$neurons[[8]]$output.aims
[1] 2

input.links

The indexes of those neurons that feed this one.

 > net.start$neurons[[1]]$input.links
[1] -1 -2
 > net.start$neurons[[5]]$input.links
[1] 1 2 3 4
 > net.start$neurons[[8]]$input.links
[1] 5 6
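
The output.links, output.aims and input.links elements must be mutually consistent: if a neuron lists neuron j in its output.links with aim k, then the k-th entry of neuron j's input.links must point back to it. The following loop (an illustrative check written against the net.start object, not package code) verifies this:

## Check that every (output.link, output.aim) pair points back to this neuron.
for (id in seq_along(net.start$neurons)) {
    neuron <- net.start$neurons[[id]]
    if (all(is.na(neuron$output.links))) next   # output neurons point nowhere
    for (k in seq_along(neuron$output.links)) {
        next.neuron <- net.start$neurons[[neuron$output.links[k]]]
        stopifnot(next.neuron$input.links[neuron$output.aims[k]] == id)
    }
}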

weights

The weights of the connections indicated by input.links.

 > net.start$neurons[[1]]$weights
[1] -0.32384606  0.09150842
 > net.start$neurons[[5]]$weights
[1]  0.31000780 -0.03621645
[3]  0.31094491 -0.25087121
 > net.start$neurons[[8]]$weights
[1] -0.24677257  0.07988028

bias

The value of the neuron’s bias term.

 > net.start$neurons[[1]]$bias
[1] 0.07560576
 > net.start$neurons[[5]]$bias
[1] -0.2522307
 > net.start$neurons[[8]]$bias
[1] -0.04253238

v0

It stores the last value obtained by applying f0, that is, the neuron's latest output.

 > net.start$neurons[[1]]$v0
[1] 0
 > net.start$neurons[[5]]$v0
[1] 0
 > net.start$neurons[[8]]$v0
[1] 0

v1

It stores the last value obtained by applying f1, the derivative of the activation function.

 > net.start$neurons[[1]]$v1
[1] 0
 > net.start$neurons[[5]]$v1
[1] 0
 > net.start$neurons[[8]]$v1
[1] 0

f0

The activation function itself. In our example, the neurons at the hidden layers have been defined as tansig, while the outputs are linear; thus, not surprisingly:

 > net.start$neurons[[1]]$f0
function (v)
{
   a.tansig <- 1.71590470857554
   b.tansig <- 0.666666666666667
   return(a.tansig * tanh(v * b.tansig))
}
 
 > net.start$neurons[[5]]$f0
function (v)
{
   a.tansig <- 1.71590470857554
   b.tansig <- 0.666666666666667
   return(a.tansig * tanh(v * b.tansig))
}
 
 > net.start$neurons[[8]]$f0
function (v)
{
   return(v)
}
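
Putting input.links, weights, bias and f0 together, the output of a first-hidden-layer neuron can be reproduced by hand. The following sketch (the input values are arbitrary and the computation is only illustrative) evaluates neuron 1 for a given input vector; for neurons in deeper layers the inputs would instead be the v0 values of the neurons listed in input.links:

x <- c(0.3, -0.7)                      # arbitrary values for Input 1 and Input 2
neuron <- net.start$neurons[[1]]
inputs <- x[abs(neuron$input.links)]   # negative indexes refer to input components
net.input <- sum(neuron$weights * inputs) + neuron$bias
neuron$f0(net.input)                   # tansig of the weighted sum plus the bias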

f1

The derivative of the activation function. Following our example,

 > net.start$neurons[[1]]$f1
function (v)
{
   a.tansig <- 1.71590470857554
   b.tansig <- 0.666666666666667
   return(a.tansig * b.tansig * (1 - tanh(v * b.tansig)^2))
}
 
 > net.start$neurons[[5]]$f1
function (v)
{
   a.tansig <- 1.71590470857554
   b.tansig <- 0.666666666666667
   return(a.tansig * b.tansig * (1 - tanh(v * b.tansig)^2))
}

whilst the outputs are linear, so

 > net.start$neurons[[8]]$f1
function (v)
{
    return(1)
}
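
Tying this to the custom option mentioned for activation.function: a minimal sketch of a user-defined activation would be to overwrite f0 and f1 by hand on a copy of the net. The softsign function used here and the name net.softsign are chosen only for illustration:

## Sketch: equip neuron 1 with a user-defined softsign activation.
net.softsign <- net.start
net.softsign$neurons[[1]]$activation.function <- "custom"
net.softsign$neurons[[1]]$f0 <- function(v) v / (1 + abs(v))      # activation
net.softsign$neurons[[1]]$f1 <- function(v) 1 / (1 + abs(v))^2    # its derivative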

method

The training method. Currently, the user may choose between the adaptive and the batch modes of the gradient descent backpropagation training method, each with or without momentum. The names of the methods are ADAPTgd, ADAPTgdwm, BATCHgd and BATCHgdwm.

  > net.start$neurons[[1]]$method
  [1] "ADAPTgdwm"
  > net.start$neurons[[5]]$method
  [1] "ADAPTgdwm"
  > net.start$neurons[[8]]$method
  [1] "ADAPTgdwm"

method.dep.variables

The variables specifically needed by the training method. They are described in the list and shown in the table below.

  • delta: This element stores the correction term derived from the cost function deltaE. For output neurons it is obtained simply by multiplying the value of deltaE by v1; for hidden neurons it is obtained by multiplying v1 by the sum, over the neurons this one points to, of their weights times their delta values (see the sketch after the table below).
  • learning.rate: It contains the learning rate value for this particular neuron. It is usually set to be equal to the learning.rate.global, but more sophisticated training methods may make use of this variable to assign different learning rate values to each neuron.
  • sum.delta.x: Used by the batch methods to accumulate the individual error effects during the forward pass over the whole training set so as to calculate the weight’s correction later during the backward pass.
  • sum.delta.bias: Similar to sum.delta.x, but now concerning the correction of the bias term.
  • momentum: Similarly to the learning.rate variable, this variable is usually equal to the momentum.global, but again, new training methods may make use of this variable to assign different momentum rate values to each neuron.
  • former.weight.change: It contains the weight changes that were applied during the previous iteration.
  • former.bias.change: It contains the bias change that was applied during the previous iteration.
ADAPTgd        ADAPTgdwm             BATCHgd         BATCHgdwm
delta          delta                 delta           delta
learning.rate  learning.rate         learning.rate   learning.rate
               momentum              sum.delta.x     sum.delta.x
               former.weight.change  sum.delta.bias  sum.delta.bias
               former.bias.change                    momentum
                                                     former.weight.change
                                                     former.bias.change

Training methods and method dependent variables.
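
As an illustration of how delta relates to the structures described above, the following sketch (not the package's internal code, part of which is written in C) computes the delta of a single neuron, assuming the signals have already been propagated so that v1 is up to date and the downstream deltas are already available:

compute.delta <- function(net, id, deltaE.value) {
    neuron <- net$neurons[[id]]
    if (neuron$type == "output") {
        ## Output neuron: delta is deltaE times the derivative of the activation.
        neuron$method.dep.variables$delta <- deltaE.value * neuron$v1
    } else {
        ## Hidden neuron: v1 times the weighted sum of the downstream deltas.
        downstream <- 0
        for (k in seq_along(neuron$output.links)) {
            next.neuron <- net$neurons[[neuron$output.links[k]]]
            w <- next.neuron$weights[neuron$output.aims[k]]
            downstream <- downstream + w * next.neuron$method.dep.variables$delta
        }
        neuron$method.dep.variables$delta <- neuron$v1 * downstream
    }
    net$neurons[[id]] <- neuron
    net
}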

Looking at our example:

 > net.start$neurons[[1]]$method.dep.variables
$delta
[1] 0
 
$learning.rate
[1] 0.01
 
$momentum
[1] 0.5
 
$former.weight.change
[1] 0 0
 
$former.bias.change
[1] 0
 
 > net.start$neurons[[5]]$method.dep.variables
$delta
[1] 0
 
$learning.rate
[1] 0.01
 
$momentum
[1] 0.5
 
$former.weight.change
[1] 0 0 0 0
 
$former.bias.change
[1] 0
 
 > net.start$neurons[[8]]$method.dep.variables
$delta
[1] 0
 
$learning.rate
[1] 0.01
 
$momentum
[1] 0.5
 
$former.weight.change
[1] 0 0
 
$former.bias.change
[1] 0

Training the net

Only a few functions are needed: the newff function to create the net, the train function to train it, and the sim.MLPnet function to simulate the response. We can also make use of the training.report function, editing and customizing it so that it reports our preferred results during the training process, or even plots some graphics. This can be done, for example, by editing the function with fix(training.report) and replacing line 2, which reads:

    P.sim <- sim.MLPnet(net, P)

with the corresponding plotting commands:

    P.sim <- sim.MLPnet(net, P)
    plot(P,T, pch="+")
    points(P,P.sim, col="red", pch="+")

That change will provide, dimensions of the data permitting, the desired graphical output every show.step steps, as defined in the train function. The following lines provide the commands to train a simple network:

require(AMORE)
 
## We create two artificial data sets: P is the input data set, target is the output.
P <- matrix(sample(seq(-1,1,length=500), 500, replace=FALSE), ncol=1)
target <- P^2 + rnorm(500, 0, 0.5)
 
## We create the neural network object
net.start <- newff(n.neurons=c(1,3,1),      
             learning.rate.global=1e-2,        
             momentum.global=0.5,              
             error.criterium="LMS",           
             Stao=NA, hidden.layer="tansig",   
             output.layer="purelin",           
             method="ADAPTgdwm") 
 
## We train the network according to P and target.
result <- train(net.start, P, target, error.criterium="LMS", report=TRUE, show.step=100, n.shows=5 )
 
## Some graphs, mainly to show that
## the trained network is now an element of the resulting list.
y <- sim.MLPnet(result$net, P)
plot(P,y, col="blue", pch="+")
points(P,target, col="red", pch="x")
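
As a quick numerical check of the fit (not part of the original example), the mean squared error over the training set can be computed directly; given the noise added when generating target, a well-trained network should approach its variance:

mean((target - y)^2)   # should approach the noise variance, 0.5^2 = 0.25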

The following figures show two of the resulting plots: :packages:cran:chu.gif (epoch number 750) and :packages:cran:zri.gif (epoch number 1000).

Future Directions

The natural development would be to provide the package with more sophisticated training methods and to speed up the training functions, mostly the C code, while keeping the flexibility. Right now, we are developing an extension of AMORE so as to provide support for RBF networks, which we hope to deliver soon.

Acknowledgments

The authors gratefully acknowledge the financial support of the Ministerio de Educación y Ciencia through project grants DPI2006–14784, DPI-2006-02454, DPI2006-03060 and DPI2007-61090; of the European Union through project grant RFSR-CT-2008-00034; and of the Autonomous Government of La Rioja for its support through the 3rd Plan Riojano de I+D+i.

Released package info

  • Version: 0.2-11
  • Date: 2009-02-19
  • Authors: Manuel Castejón Limas, Joaquín B. Ordieres Meré, Francisco Javier Martínez de Pisón Ascacibar, Alpha V. Pernía Espinoza, Fernando Alba Elías, Ana González Marcos, Eliseo P. Vergara González
  • Maintainer: Manuel Castejón Limas
  • License: GPL version 2 or newer.
  • Corresponding Author:
Manuel Castejón Limas,
Área de Proyectos de Ingeniería. Universidad de León.
Escuela de Ingenierías Industrial e Informática.
Campus de Vegazana sn. León. Castilla y León.
Spain.
e-mail: manuel.castejon at unileon.es
 