tcl-fann

NAME
SYNOPSIS
INTRODUCTION
DESCRIPTION
TRAINING ALGORITHMS
ACTIVATION FUNCTIONS
ERROR FUNCTIONS
ERROR STOP FUNCTIONS
EXAMPLE
OUTPUT
AUTHOR
COPYRIGHT

NAME

tcl-fann - A Tcl extension for Artificial Neural Networks

SYNOPSIS

package require fann

fann create name ?-sparse connection_rate | -shortcut? layers layer1 layer2 ...

fann load name filepath

name init ?min_weight max_weight?

name params

name params ?-training_algorithm incremental | batch | rprop | quickprop? ?-learning_rate num? ?-learning_momentum num? ?-activation_function linear | threshold | threshold_symmetric | sigmoid | sigmoid_stepwise | sigmoid_symmetric | sigmoid_symmetric_stepwise | gaussian | gaussian_symmetric | elliot | elliot_symmetric | linear_piece | linear_piece_symmetric | sin_symmetric | cos_symmetric | sin | cos | gaussian_stepwise? ?-activation_steepness num? ?-train_error_function linear | tanh? ?-train_stop_function mse | bit? ?-bit_fail_limit num? ?-quickprop_decay num? ?-quickprop_mu num? ?-rprop_increase_factor num? ?-rprop_decrease_factor num? ?-rprop_delta_min num? ?-rprop_delta_max num? ?-rprop_delta_zero num?

name train input output

name test input output

name run input

name error ?bitfail?

name save filepath

name info

name trainondata epochs error input output

name trainonfile filepath epochs error

name function hidden function

name function output function

name function layer layer function

name steepness hidden steepness

name steepness output steepness

name steepness layer layer steepness

name copy newname

name destroy

rename name {}

INTRODUCTION

A Tcl extension for Artificial Neural Networks.

An artificial neural network (ANN), often just called a "neural network" (NN), is a mathematical model or computational model based on biological neural networks. It consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase. In more practical terms, neural networks are non-linear statistical data modeling tools. They can be used to model complex relationships between inputs and outputs or to find patterns in data.

For an introduction to ANN visit: http://leenissen.dk/fann/html/files2/theory-txt.html

This extension enables artificial neural network processing in Tcl. It uses the FANN (Fast Artificial Neural Network) library underneath. The FANN library is a free open source neural network library, which implements multilayer artificial neural networks in C with support for both fully connected and sparsely connected networks. Cross-platform execution in both fixed and floating point is supported. It includes a framework for easy handling of training data sets. It is easy to use, versatile, well documented, and fast.

tcl-fann supports:

* Fully, sparsely and shortcut connected multi-layer networks
* Backpropagation training (RPROP, Quickprop, Batch, Incremental)
* Evolving topology training which dynamically builds and trains the ANN (Cascade)
* Versatility (many parameters and features can be adjusted on the fly)
* Several different activation functions (including stepwise linear functions for that extra bit of speed)
* Saving and loading of entire ANNs
* Cross-platform operation (Linux/Unix & MS Windows)

DESCRIPTION

fann create name ?-sparse connection_rate | -shortcut? layers layer1 layer2 ...

Create a new ANN named name with layers number of layers and layer1 layer2 ... number of neurons per layer respectively, starting from the input layer and moving towards the output layer. Without any switches a regular ANN is created, where each neuron is connected with every neuron of the next layer. Using the -sparse switch, however, you can reduce the number of connections as dictated by connection_rate. For example, a connection rate of 0.5 means that every neuron will be connected with half of the neurons of the next layer. The exact neurons to connect to are chosen randomly. The -shortcut switch, on the other hand, results in a network in which every neuron is connected with all neurons of all the following layers.
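A minimal sketch of the three connection schemes (the network names and layer sizes are illustrative):

# Fully connected: 3 layers with 2, 4 and 1 neurons.
fann create net_full 3 2 4 1
# Sparse: each neuron connects to half of the next layer's neurons.
fann create net_sparse -sparse 0.5 3 2 4 1
# Shortcut: every neuron connects to all following layers.
fann create net_short -shortcut 3 2 4 1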

fann load name filepath

Load and create a new ANN named name from the file filepath. This command should be used to load an ANN that has been previously saved with the command: name save filepath

name init ?min_weight max_weight?

Randomize all weights between -0.1 and 0.1, or alternatively between min_weight and max_weight if specified.
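For example, assuming an ANN named net has already been created:

net init             ;# randomize weights in [-0.1, 0.1]
net init -1.0 1.0    ;# randomize weights in [-1.0, 1.0]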

name params

Get various mutable parameters of the ANN. See below.

name params ?-training_algorithm incremental | batch | rprop | quickprop? ?-learning_rate num? ?-learning_momentum num? ?-activation_function linear | threshold | threshold_symmetric | sigmoid | sigmoid_stepwise | sigmoid_symmetric | sigmoid_symmetric_stepwise | gaussian | gaussian_symmetric | elliot | elliot_symmetric | linear_piece | linear_piece_symmetric | sin_symmetric | cos_symmetric | sin | cos | gaussian_stepwise ...? ?-activation_steepness num ...? ?-train_error_function linear | tanh? ?-train_stop_function mse | bit? ?-bit_fail_limit num? ?-quickprop_decay num? ?-quickprop_mu num? ?-rprop_increase_factor num? ?-rprop_decrease_factor num? ?-rprop_delta_min num? ?-rprop_delta_max num? ?-rprop_delta_zero num?

Set various configuration aspects of the ANN. A combined example follows the option descriptions below.

-training_algorithm

Set the training algorithm. The default training algorithm is rprop.

See TRAINING ALGORITHMS for more details.

-learning_rate

Set the learning rate. The learning rate is used to determine how aggressive training should be for some of the training algorithms (incremental, batch, quickprop). Note, however, that it is not used by rprop.

-learning_momentum

Set the learning momentum. The learning momentum can be used to speed up incremental training. Too high a momentum will, however, not benefit training. Setting the momentum to 0 is the same as not using the momentum parameter. The recommended value of this parameter is between 0.0 and 1.0. The default momentum is 0.

-activation_function

Set the activation function for each neuron in the ANN, except for the neurons in the input layer, for which it cannot be set. This switch takes a list as an argument, giving the activation function of each neuron, counting the neurons of each layer top-down, from the layer next to the input layer through the output layer. A more convenient way to set the activation function for some neurons in the network is to use the function subcommand.

When choosing an activation function it is important to note that the activation functions have different ranges. sigmoid, for example, is in the 0 to 1 range while sigmoid_symmetric is in the -1 to 1 range, and linear is unbounded. The default activation function is sigmoid_stepwise.

Information about the individual activation functions is available in the ACTIVATION FUNCTIONS section.

-activation_steepness

Set the steepness of the activation function for each neuron in the ANN, except for the neurons in the input layer, for which it cannot be set. This switch takes a list as an argument, giving the steepness value for the activation function of each neuron, counting the neurons of each layer top-down, from the layer next to the input layer through the output layer. A more convenient way to set the steepness is to use the steepness subcommand.

The steepness of an activation function determines how fast the activation function goes from the minimum to the maximum. A high steepness value also gives more aggressive training. When training neural networks where the output values should be at the extremes (usually 0 and 1, depending on the activation function), a steep activation function can be used (e.g. 1.0). The default activation steepness is 0.5.

-train_error_function

Set the error function to be used. This function is used to calculate the error during training. See ERROR FUNCTIONS for details.

-train_stop_function

Set the error stop function to be used. This function is used to determine when training should be terminated. See ERROR STOP FUNCTIONS for details.

-bit_fail_limit

Set the bit fail limit used during training.

The bit fail limit is used during training when the -train_stop_function is set to bit. The limit is the maximum accepted difference between the desired output and the actual output during training. Each output that diverges more than this limit is counted as an error bit. This difference is divided by two when dealing with symmetric activation functions, so that symmetric and non-symmetric activation functions can use the same limit. The default bit fail limit is 0.35.

-quickprop_decay

The decay is a small negative number that determines by what factor the weights should shrink in each iteration of quickprop training. It is used to make sure that the weights do not become too large during training. The default decay is -0.0001.

-quickprop_mu

The mu factor is used to increase and decrease the step-size during quickprop training. The mu factor should always be above 1, since it would otherwise decrease the step-size when it was supposed to increase it.

-rprop_increase_factor

The increase factor is a value larger than 1, which is used to increase the step-size during RPROP training. The default increase factor is 1.2.

-rprop_decrease_factor

The decrease factor is a value smaller than 1, which is used to decrease the step-size during RPROP training. The default decrease factor is 0.5.

-rprop_delta_min

The minimum step-size is a small positive number determining how small the step-size may become. The default delta min is 0.0.

-rprop_delta_max

The maximum step-size is a positive number determining how large the step-size may become. The default delta max is 50.0.

-rprop_delta_zero

The initial step-size is a positive number that sets the step-size used at the start of RPROP training. The default delta zero is 0.1.
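As a combined sketch of setting several parameters at once (the network name and all values are illustrative; the activation lists assume a 2-4-1 network, i.e. 4 hidden neurons plus 1 output neuron):

fann create net 3 2 4 1
net params -training_algorithm incremental \
    -learning_rate 0.7 \
    -learning_momentum 0.4 \
    -activation_function {sigmoid sigmoid sigmoid sigmoid sigmoid_symmetric} \
    -activation_steepness {1.0 1.0 1.0 1.0 0.5} \
    -train_stop_function bit \
    -bit_fail_limit 0.1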

name train input output

Run the training algorithm once, for a single input/output correspondence.

name test input output

Test the ANN against a single input/output correspondence. The error can later be retrieved with the error subcommand.

name run input

Return the output generated by the ANN for input input.

name error ?bitfail?

Return the error from the last training or test, either as the Mean Square Error or, if bitfail is given, as the number of failing bits. See ERROR FUNCTIONS and ERROR STOP FUNCTIONS.
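A short sketch of a train/test/run cycle (the network name and patterns are illustrative):

net train {0 1} {1}          ;# one training pass on a single pattern
net test {1 1} {0}           ;# evaluate without training
puts [net error]             ;# MSE from the last training or test
puts [net error bitfail]     ;# number of failing output bits
puts [net run {0 1}]         ;# the network's output for this input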

name save filepath

Save the ANN with all its properties and current state of the weights in filepath.
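Saving and reloading form a simple round trip (the file name is illustrative):

net save net.ann
fann load net2 net.ann    ;# net2 now behaves as net did when saved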

name info

Generate and return a dictionary with various read-only information about the ANN. The following dictionary keys are defined:

network_type

Either layer or shortcut.

total_neurons

Total number of neurons in the ANN.

total_connections

Total number of connections in the ANN.

connection_rate

The connection rate of each neuron with the neurons of the next layer(s). A connection rate of 1.0 corresponds to full connectivity.

neurons_per_layer

Number of neurons per layer, starting from the input layer.

bias_per_layer

The bias at each layer, starting from the input layer.

connections

A list that describes the architecture of the whole ANN. It consists of sublists with the following structure:
from_neuron to_neuron weight
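The result can be examined with Tcl's standard dict command, for example (network name illustrative):

set d [net info]
puts "type: [dict get $d network_type], neurons: [dict get $d total_neurons]"
foreach conn [dict get $d connections] {
    lassign $conn from to weight
    puts "  $from -> $to  weight $weight"
}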

name trainondata epochs error input output

Train the ANN on the input/output sets contained in the lists input and output respectively. Training runs for at most epochs epochs, stopping earlier if the error drops below error.
The format of the input list is:
{{set_1_input_1 set_1_input_2 ...} {set_2_input_1 set_2_input_2 ...} ...}
For the output list:
{{set_1_output_1 set_1_output_2 ...} {set_2_output_1 set_2_output_2 ...} ...}
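For example, training a 2-input, 1-output network on the XOR patterns for at most 1000 epochs or until the error drops below 0.01 (network name and values illustrative):

net trainondata 1000 0.01 \
    {{0 0} {0 1} {1 0} {1 1}} \
    {{0} {1} {1} {0}}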

name trainonfile filepath epochs error

Train the ANN on the input/output sets contained in the file filepath, for at most epochs epochs or until the error drops below error. The format of this file is as follows:
number_of_sets_in_the_file number_of_inputs_of_the_ANN number_of_outputs_of_the_ANN
set_1_input_1 set_1_input_2 set_1_input_3
...
set_1_output_1 set_1_output_2 set_1_output_3
...
set_2_input_1 set_2_input_2 set_2_input_3
...
set_2_output_1 set_2_output_2 set_2_output_3
...
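A sketch that writes such a file for XOR and trains on it (the file name is illustrative):

set f [open xor.data w]
puts $f "4 2 1"        ;# 4 sets, 2 inputs, 1 output
foreach {in out} {{0 0} 0 {0 1} 1 {1 0} 1 {1 1} 0} {
    puts $f $in
    puts $f $out
}
close $f
net trainonfile xor.data 1000 0.01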

name function hidden function

Set the activation function in all of the hidden layers.

When choosing an activation function it is important to note that the activation functions have different ranges. sigmoid, for example, is in the 0 to 1 range while sigmoid_symmetric is in the -1 to 1 range, and linear is unbounded.

Information about the individual activation functions is available in the ACTIVATION FUNCTIONS section.

The default activation function is sigmoid_stepwise.

name function output function

Set the activation function in the output layer.

name function layer layer function

Set the activation function of the neurons in layer layer, counting the input layer as layer 0.

It is not possible to set an activation function for the neurons in the input layer, therefore layer can take values from 1 up to the number of layers in the ANN minus one.
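For example, for a 3-layer network (layers 0, 1 and 2, where layer 2 is the output layer):

net function hidden sigmoid_symmetric    ;# all hidden layers
net function output linear               ;# the output layer
net function layer 1 gaussian            ;# the first hidden layer only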

name steepness hidden steepness

Set the steepness of the activation function in all of the hidden layers.

The steepness of an activation function determines how fast the activation function goes from the minimum to the maximum. A high steepness value also gives more aggressive training.

When training neural networks where the output values should be at the extremes (usually 0 and 1, depending on the activation function), a steep activation function can be used (e.g. 1.0).

The default activation steepness is 0.5.

name steepness output steepness

Set the steepness of the activation function in the output layer.

name steepness layer layer steepness

Set the activation steepness of the neurons in layer layer, counting the input layer as layer 0.

It is not possible to set the activation steepness for the neurons in the input layer, therefore layer can take values from 1 up to the number of layers in the ANN minus one.

name copy newname

Make an exact copy of name named newname. The two ANNs are completely independent of each other, and can follow different courses from then on.

name destroy

Destroy the ANN and free memory.
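A small sketch of copying and cleaning up (names illustrative):

net copy net2
net2 params -learning_rate 0.5    ;# net is unaffected
net2 destroy
net destroy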

TRAINING ALGORITHMS

incremental

Standard backpropagation algorithm, where the weights are updated after each training pattern. This means that the weights are updated many times during a single epoch. For this reason some problems will train very fast with this algorithm, while other, more advanced problems will not train very well.

batch

Standard backpropagation algorithm, where the weights are updated after calculating the mean square error for the whole training set. This means that the weights are only updated once per epoch. For this reason some problems will train more slowly with this algorithm. But since the mean square error is calculated more correctly than in incremental training, some problems will reach a better solution with this algorithm.

rprop

A more advanced batch training algorithm which achieves good results for many problems. The RPROP training algorithm is adaptive and therefore does not use the learning rate. Some other parameters can be set to change the way the RPROP algorithm works, but changing them is only recommended for users with insight into how the RPROP training algorithm works. The RPROP training algorithm is described by [Riedmiller and Braun, 1993], but the actual learning algorithm used here is the iRPROP- training algorithm described by [Igel and Husken, 2000], which is a variant of the standard RPROP training algorithm.

quickprop

A more advanced batch training algorithm which achieves good results for many problems. The quickprop training algorithm uses the learning_rate parameter along with other, more advanced parameters, but changing these advanced parameters is only recommended for users with insight into how the quickprop training algorithm works. The quickprop training algorithm is described by [Fahlman, 1988].

ACTIVATION FUNCTIONS

The activation functions are described by formulas where:

x

is the input to the activation function,

y

is the output,

s

is the steepness and

d

is the derivative.

linear

Linear activation function.

span: −inf < y < inf

y = x*s, d = 1*s

Can NOT be used in fixed point.

threshold

Threshold activation function.

x < 0 -> y = 0, x >= 0 -> y = 1

Can NOT be used during training.

threshold_symmetric

Symmetric threshold activation function.

x < 0 -> y = -1, x >= 0 -> y = 1

Can NOT be used during training.

sigmoid

Sigmoid activation function.

One of the most used activation functions.

span: 0 < y < 1

y = 1/(1 + exp(−2*s*x))

d = 2*s*y*(1 − y)
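As a quick sanity check of the formula, a hypothetical Tcl helper evaluating it directly:

proc sigmoid {x s} {
    expr {1.0 / (1.0 + exp(-2.0 * $s * $x))}
}
puts [sigmoid 0.0 0.5]    ;# prints 0.5, the midpoint of the 0..1 span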

sigmoid_stepwise

Stepwise linear approximation to sigmoid.

Faster than sigmoid but a bit less precise.

sigmoid_symmetric

Symmetric sigmoid activation function, a.k.a. tanh.

One of the most used activation functions.

span: −1 < y < 1

y = tanh(s*x) = 2/(1 + exp(−2*s*x)) − 1

d = s*(1−(y*y))

sigmoid_symmetric_stepwise

Stepwise linear approximation to symmetric sigmoid.

Faster than symmetric sigmoid but a bit less precise.

gaussian

Gaussian activation function.

0 when x = −inf, 1 when x = 0 and 0 when x = inf

span: 0 < y < 1

y = exp(−x*s*x*s)

d = −2*x*s*y*s

gaussian_symmetric

Symmetric gaussian activation function.

−1 when x = −inf, 1 when x = 0 and −1 when x = inf

span: −1 < y < 1

y = exp(−x*s*x*s)*2−1

d = −2*x*s*(y+1)*s

elliot

Fast (sigmoid-like) activation function defined by David Elliott.

span: 0 < y < 1

y = ((x*s) / 2) / (1 + |x*s|) + 0.5

d = s*1/(2*(1+|x*s|)*(1+|x*s|))

elliot_symmetric

Fast (symmetric sigmoid-like) activation function defined by David Elliott.

span: −1 < y < 1

y = (x*s) / (1 + |x*s|)

d = s*1/((1+|x*s|)*(1+|x*s|))

linear_piece

Bounded linear activation function.

span: 0 <= y <= 1

y = x*s, d = 1*s

linear_piece_symmetric

Bounded linear activation function.

span: −1 <= y <= 1

y = x*s, d = 1*s

sin_symmetric

Periodic sine activation function.

span: −1 <= y <= 1

y = sin(x*s)

d = s*cos(x*s)

cos_symmetric

Periodic cosine activation function.

span: −1 <= y <= 1

y = cos(x*s)

d = s*−sin(x*s)

sin

Periodic sine activation function.

span: 0 <= y <= 1

y = sin(x*s)/2+0.5

d = s*cos(x*s)/2

cos

Periodic cosine activation function.

span: 0 <= y <= 1

y = cos(x*s)/2+0.5

d = s*−sin(x*s)/2

ERROR FUNCTIONS

Error function used during training.

linear

Standard linear error function.

tanh

Tanh error function, usually better but possibly requiring a lower learning rate. This error function aggressively targets outputs that differ much from the desired values, while putting less emphasis on outputs that differ only slightly. This error function is not recommended for cascade training and incremental training.

ERROR STOP FUNCTIONS

Stop criteria used during training.

mse

The stop criterion is the Mean Square Error (MSE) value.

bit

The stop criterion is the number of bits that fail, that is, the number of output neurons whose output differs from the desired output by more than the bit fail limit. The bits are counted over all of the training data, so this number can be higher than the number of training sets.

EXAMPLE

# The typical XOR example.
package require fann

# A fully connected network: 3 layers with 2 inputs, 3 hidden neurons, 1 output.
fann create xor 3 2 3 1

# Several equivalent ways of setting the same activation function
# for all non-input neurons (3 hidden + 1 output):
xor params -activation_function {sigmoid_symmetric sigmoid_symmetric sigmoid_symmetric sigmoid_symmetric}
xor function hidden sigmoid_symmetric
xor function output sigmoid_symmetric
xor function layer 1 sigmoid_symmetric
xor function layer 2 sigmoid_symmetric

# Train for at most 500000 epochs, or until the error drops below 0.001.
xor trainondata 500000 0.001 {{-1 -1} {-1 1} {1 -1} {1 1}} {{-1} {1} {1} {-1}}

puts 1:1:[format "%.5f" [xor run {1 1}]]
puts 1:-1:[format "%.5f" [xor run {1 -1}]]
puts -1:1:[format "%.5f" [xor run {-1 1}]]
puts -1:-1:[format "%.5f" [xor run {-1 -1}]]

xor destroy

OUTPUT

1:1:-0.95379
1:-1:0.92959
-1:1:0.95144
-1:-1:-0.99688

AUTHOR

Alexandros Stergiakis <sterg@kth.se>

COPYRIGHT

Copyright (C) 2008 Alexandros Stergiakis

This program is free software: you can redistribute it and/or
modify it under the terms of the GNU General Public License as
published by the Free Software Foundation, either version 3 of
the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.