tcl-fann - A Tcl extension for Artificial Neural Networks
package require fann

fann create name ?-sparse connection_rate | -shortcut? layers layer1 layer2 ...
fann load name filepath

name init ?min_weight max_weight?
name params
name params ?-training_algorithm incremental | batch | rprop | quickprop? ?-learning_rate num? ?-learning_momentum num? ?-activation_function linear | threshold | threshold_symmetric | sigmoid | sigmoid_stepwise | sigmoid_symmetric | sigmoid_symmetric_stepwise | gaussian | gaussian_symmetric | elliot | elliot_symmetric | linear_piece | linear_piece_symmetric | sin_symmetric | cos_symmetric | sin | cos | gaussian_stepwise? ?-activation_steepness num? ?-train_error_function linear | tanh? ?-train_stop_function mse | bit? ?-bit_fail_limit num? ?-quickprop_decay num? ?-quickprop_mu num? ?-rprop_increase_factor num? ?-rprop_decrease_factor num? ?-rprop_delta_min num? ?-rprop_delta_max num? ?-rprop_delta_zero num?
name train input output
name test input output
name run input
name error ?bitfail?
name save filepath
name info
name trainondata epochs error input output
name trainonfile filepath epochs error
name function hidden function
name function output function
name function layer layer function
name steepness hidden steepness
name steepness output steepness
name steepness layer layer steepness
name copy newname
name destroy
rename name {}
A Tcl extension for Artificial Neural Networks.

An artificial neural network (ANN), often just called a "neural network" (NN), is a mathematical or computational model based on biological neural networks. It consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase. In more practical terms, neural networks are non-linear statistical data modeling tools. They can be used to model complex relationships between inputs and outputs or to find patterns in data. For an introduction to ANNs visit: http://leenissen.dk/fann/html/files2/theory-txt.html

This extension enables artificial neural network processing in Tcl. It uses the FANN (Fast Artificial Neural Networks) library underneath. The FANN library is a free open source neural network library, which implements multilayer artificial neural networks in C with support for both fully connected and sparsely connected networks. Cross-platform execution in both fixed and floating point is supported. It includes a framework for easy handling of training data sets. It is easy to use, versatile, well documented, and fast.

tcl-fann supports:

* Fully, sparsely and shortcut connected multi-layer networks.
* Backpropagation training (RPROP, Quickprop, Batch, Incremental).
* Evolving topology training which dynamically builds and trains the ANN (Cascade).
* Versatile (possible to adjust many parameters and features on-the-fly).
* Several different activation functions, including stepwise linear functions for that extra bit of speed.
* Saving and loading of entire ANNs.
* Cross-platform (Linux/Unix & MS Windows).
fann create name ?-sparse connection_rate | -shortcut? layers layer1 layer2 ...
Create a new ANN named name with layers number of layers and layer1 layer2 ... number of neurons per layer respectively, starting from the input layer and moving towards the output layer. Without any switches a regular ANN is created, where each neuron is connected with every neuron of the next layer. Using the -sparse switch, however, you can reduce the number of connections as dictated by connection_rate. For example, a connection rate of 0.5 means that every neuron will be connected with half of the neurons of the next layer. The exact neurons to connect to are chosen randomly. The -shortcut switch, on the other hand, results in a network in which every neuron is connected with all neurons of all the following layers.
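As a minimal sketch of the three connection modes (network names and layer sizes here are illustrative, and assume the compiled fann extension is available):

```tcl
package require fann

# Fully connected: 3 layers with 2 input, 4 hidden and 1 output neurons.
fann create full 3 2 4 1

# Sparse: each neuron connects to roughly half of the next layer's neurons,
# chosen randomly.
fann create half -sparse 0.5 3 2 4 1

# Shortcut: every neuron connects to all neurons of all following layers.
fann create short -shortcut 3 2 4 1

full destroy
half destroy
short destroy
```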
fann load name filepath
Load and create a new ANN named name from the file filepath. This command should be used to load an ANN that has been previously saved with the command: name save filepath
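A hypothetical save/load round trip might look as follows (the file path is illustrative):

```tcl
package require fann

# Create, persist and destroy a network ...
fann create net 3 2 3 1
net save /tmp/net.ann
net destroy

# ... then recreate it later under a new name from the saved file.
fann load net2 /tmp/net.ann
net2 destroy
```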
name init ?min_weight max_weight?
Randomize all weights between -0.1 and 0.1, or alternatively between min_weight and max_weight if specified.
name params
Get various mutable parameters of the ANN. See below.
name params ?-training_algorithm incremental | batch | rprop | quickprop? ?-learning_rate num? ?-learning_momentum num? ?-activation_function linear | threshold | threshold_symmetric | sigmoid | sigmoid_stepwise | sigmoid_symmetric | sigmoid_symmetric_stepwise | gaussian | gaussian_symmetric | elliot | elliot_symmetric | linear_piece | linear_piece_symmetric | sin_symmetric | cos_symmetric | sin | cos | gaussian_stepwise ...? ?-activation_steepness num ...? ?-train_error_function linear | tanh? ?-train_stop_function mse | bit? ?-bit_fail_limit num? ?-quickprop_decay num? ?-quickprop_mu num? ?-rprop_increase_factor num? ?-rprop_decrease_factor num? ?-rprop_delta_min num? ?-rprop_delta_max num? ?-rprop_delta_zero num?
Set various configuration aspects of the ANN.
-training_algorithm
Set the training algorithm. The default training algorithm is rprop.
See TRAINING ALGORITHMS for more details.
-learning_rate
Set the learning rate. The learning rate is used to determine how aggressive training should be for some of the training algorithms (incremental, batch, quickprop). Note, however, that it is not used in rprop.
-learning_momentum
Set the learning momentum. The learning momentum can be used to speed up incremental training. Too high a momentum will, however, not benefit training. Setting momentum to 0 is the same as not using the momentum parameter. The recommended value of this parameter is between 0.0 and 1.0. The default momentum is 0.
-activation_function
Set the activation function for each neuron in the ANN, except for the neurons in the input layer, for which it cannot be set. This switch receives as argument a list with the activation function of each neuron, counting the neurons of each layer top-down, starting from the first layer after the input layer and continuing to the output layer. A more convenient way to set the activation function for some neurons in the network is to use the function subcommand.
When choosing an activation function it is important to note that the activation functions have different ranges. sigmoid is e.g. in the 0 - 1 range while sigmoid_symmetric is in the -1 - 1 range and linear is unbounded. The default activation function is sigmoid_stepwise. Information about the individual activation functions is available in the ACTIVATION FUNCTIONS section.
-activation_steepness
Set the steepness of the activation function for each neuron in the ANN, except for the neurons in the input layer, for which it cannot be set. This switch receives as argument a list with the steepness value for the activation function of each neuron, counting the neurons of each layer top-down, starting from the first layer after the input layer and continuing to the output layer. A more convenient way to set the steepness is to use the steepness subcommand.
The steepness of an activation function determines how fast the activation function goes from its minimum to its maximum. A high steepness value also gives more aggressive training. When training neural networks where the output values should be at the extremes (usually 0 and 1, depending on the activation function), a steep activation function can be used (e.g. 1.0). The default activation steepness is 0.5.
-train_error_function
Set the error function to be used. This function is used to calculate the error during training. See ERROR FUNCTIONS for details.
-train_stop_function
Set the error stop function to be used. This function is used to determine when training should be terminated. See ERROR STOP FUNCTIONS for details.
-bit_fail_limit
Set the bit fail limit used during training.
The bit fail limit is used during training when the -train_stop_function is set to bit. The limit is the maximum accepted difference between the desired output and the actual output during training. Each output that diverges more than this limit is counted as an error bit. This difference is divided by two when dealing with symmetric activation functions, so that symmetric and non-symmetric activation functions can use the same limit. The default bit fail limit is 0.35.
-quickprop_decay
The decay is a small negative valued number which is the factor that the weights should become smaller by in each iteration during quickprop training. This is used to make sure that the weights do not become too high during training. The default decay is -0.0001.
-quickprop_mu
The mu factor is used to increase and decrease the step-size during quickprop training. The mu factor should always be above 1, since it would otherwise decrease the step-size when it was supposed to increase it. The default mu factor is 1.75.
-rprop_increase_factor
The increase factor is a value larger than 1, which is used to increase the step-size during RPROP training. The default increase factor is 1.2.
-rprop_decrease_factor
The decrease factor is a value smaller than 1, which is used to decrease the step-size during RPROP training. The default decrease factor is 0.5.
-rprop_delta_min
The minimum step-size is a small positive number determining how small the minimum step-size may be. The default delta min is 0.0.
-rprop_delta_max
The maximum step-size is a positive number determining how large the maximum step-size may be. The default delta max is 50.0.
-rprop_delta_zero
The initial step-size is a positive number determining the initial step size. The default delta zero is 0.1.
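The switches above can be combined in a single params call. A minimal sketch (parameter values are illustrative, and the compiled fann extension is assumed):

```tcl
package require fann

fann create net 3 2 3 1

# Switch to quickprop and tune a few of its parameters; also stop
# training on failed bits rather than on mean square error.
net params -training_algorithm quickprop \
           -learning_rate 0.7 \
           -quickprop_decay -0.0001 \
           -quickprop_mu 1.75 \
           -train_stop_function bit \
           -bit_fail_limit 0.35

# With no switches, params returns the current settings.
puts [net params]

net destroy
```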
name train input output
Run the training algorithm once, for a single input/output correspondence.
name test input output
Test the ANN against a single input/output correspondence. The error can later be retrieved with the error subcommand.
name run input
Return the output generated by the ANN for input input.
name error ?bitfail?
Return the error from the last training, in the form of the Mean Square Error, or the bit fail count if bitfail is specified. See ERROR FUNCTIONS.
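The single-pattern commands fit together as sketched below (input/output values are illustrative, and the compiled fann extension is assumed):

```tcl
package require fann

fann create net 3 2 3 1

# One incremental training step on a single pattern.
net train {0 1} {1}

# Evaluate the same pattern without changing the weights,
# then inspect the resulting error.
net test {0 1} {1}
puts [net error]          ;# mean square error from the last test

# Query the network's current output for an input.
puts [net run {0 1}]

net destroy
```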
name save filepath
Save the ANN with all its properties and current state of the weights in filepath.
name info
Generate and return a dictionary that includes various immutable information about the ANN. The following dictionary keys are defined:
network_type
Either layer or shortcut.
total_neurons
Total number of neurons in the ANN.
total_connections
Total number of connections in the ANN.
connection_rate
The connection rate of each neuron with the neurons of the next layer(s). A connection rate of 1.0 corresponds to full connectivity.
neurons_per_layer
Number of neurons per layer, starting from the input layer.
bias_per_layer
The bias at each layer, starting from the input layer.
connections
A list that describes the architecture of the whole ANN. It consists of sublists with the following structure:
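Since info returns a dictionary, individual keys can be read with the standard dict command. A minimal sketch (the compiled fann extension is assumed):

```tcl
package require fann

fann create net 3 2 3 1

# Read individual keys out of the immutable-info dictionary.
set d [net info]
puts "type:    [dict get $d network_type]"
puts "neurons: [dict get $d total_neurons]"
puts "layers:  [dict get $d neurons_per_layer]"

net destroy
```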
name trainondata epochs error input output
Train the ANN on the input/output sets contained in the lists input and output respectively, for at most epochs epochs or until the training error drops below error.
name trainonfile filepath epochs error
Train the ANN on the input/output sets contained in the file filepath, for at most epochs epochs or until the training error drops below error. The format of this file is as follows:
name function hidden function
Set the activation function in all of the hidden layers. When choosing an activation function it is important to note that the activation functions have different ranges. sigmoid is e.g. in the 0 - 1 range while sigmoid_symmetric is in the -1 - 1 range and linear is unbounded. Information about the individual activation functions is available in the ACTIVATION FUNCTIONS section. The default activation function is sigmoid_stepwise.
name function output function
Set the activation function in the output layer.
name function layer layer function
Set the activation function of the neurons in layer layer, counting the input layer as layer 0. It is not possible to set an activation function for the neurons in the input layer, therefore layer can take values from 1 up to the number of layers in the ANN.
name steepness hidden steepness
Set the steepness of the activation function in all of the hidden layers. The steepness of an activation function determines how fast the activation function goes from its minimum to its maximum. A high steepness value also gives more aggressive training. When training neural networks where the output values should be at the extremes (usually 0 and 1, depending on the activation function), a steep activation function can be used (e.g. 1.0). The default activation steepness is 0.5.
name steepness output steepness
Set the steepness of the activation function in the output layer.
name steepness layer layer steepness
Set the activation steepness of the neurons in layer layer, counting the input layer as layer 0. It is not possible to set activation steepness for the neurons in the input layer, therefore layer can take values from 1 up to the number of layers in the ANN.
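The function and steepness subcommands target whole layer groups, which is usually more convenient than the per-neuron params lists. A minimal sketch (function choices are illustrative, and the compiled fann extension is assumed):

```tcl
package require fann

# 4 layers: 2 input, two hidden layers of 3, 1 output neuron.
fann create net 4 2 3 3 1

net function hidden sigmoid_symmetric   ;# all hidden layers
net function output linear              ;# output layer only
net function layer 1 gaussian           ;# just the first hidden layer

net steepness hidden 1.0                ;# steeper, more aggressive training
net steepness output 0.5

net destroy
```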
name copy newname
Make an exact copy of name named newname. The two ANNs are completely independent from each other, and can follow different courses from now on.
name destroy
Destroy the ANN and free its memory.
incremental
Standard backpropagation algorithm, where the weights are updated after each training pattern. This means that the weights are updated many times during a single epoch. For this reason some problems will train very fast with this algorithm, while other, more advanced problems will not train very well.
batch
Standard backpropagation algorithm, where the weights are updated after calculating the mean square error for the whole training set. This means that the weights are only updated once per epoch. For this reason some problems will train more slowly with this algorithm. But since the mean square error is calculated more correctly than in incremental training, some problems will reach better solutions with this algorithm.
rprop
A more advanced batch training algorithm which achieves good results for many problems. The RPROP training algorithm is adaptive and therefore does not use the learning_rate. Some other parameters can however be set to change the way the RPROP algorithm works, but this is only recommended for users with insight into how the RPROP training algorithm works. The RPROP training algorithm is described by [Riedmiller and Braun, 1993], but the actual learning algorithm used here is the iRPROP- training algorithm described by [Igel and Husken, 2000], which is a variant of the standard RPROP training algorithm.
quickprop
A more advanced batch training algorithm which achieves good results for many problems. The quickprop training algorithm uses the learning_rate parameter along with other, more advanced parameters, but changing these advanced parameters is only recommended for users with insight into how the quickprop training algorithm works. The quickprop training algorithm is described by [Fahlman, 1988].
The functions below are described with formulas where:
x is the input to the activation function,
y is the output,
s is the steepness and
d is the derivative.
linear
Linear activation function.
span: -inf < y < inf
y = x*s, d = 1*s
Can NOT be used in fixed point.
threshold
Threshold activation function.
x < 0 -> y = 0, x >= 0 -> y = 1
Can NOT be used during training.
threshold_symmetric
Symmetric threshold activation function.
x < 0 -> y = -1, x >= 0 -> y = 1
Can NOT be used during training.
sigmoid
Sigmoid activation function. One of the most used activation functions.
span: 0 < y < 1
y = 1/(1 + exp(-2*s*x))
d = 2*s*y*(1 - y)
sigmoid_stepwise
Stepwise linear approximation to sigmoid. Faster than sigmoid but a bit less precise.
sigmoid_symmetric
Symmetric sigmoid activation function, aka. tanh. One of the most used activation functions.
span: -1 < y < 1
y = tanh(s*x) = 2/(1 + exp(-2*s*x)) - 1
d = s*(1 - (y*y))
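The sigmoid_symmetric formula can be checked in plain Tcl, with no extension required; the proc name below is illustrative, not part of the tcl-fann API:

```tcl
# Pure-Tcl check of the sigmoid_symmetric formula:
# y = 2/(1 + exp(-2*s*x)) - 1, which equals tanh(s*x).
proc sigmoid_symmetric {x s} {
    expr {2.0 / (1.0 + exp(-2.0 * $s * $x)) - 1.0}
}

# The values agree with tanh and stay within the documented -1 < y < 1 span.
foreach x {-5 -1 0 1 5} {
    set y [sigmoid_symmetric $x 0.5]
    puts [format "x=%2d  y=%9.6f  tanh=%9.6f" $x $y [expr {tanh(0.5 * $x)}]]
}
```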
sigmoid_symmetric_stepwise
Stepwise linear approximation to symmetric sigmoid. Faster than symmetric sigmoid but a bit less precise.
gaussian
Gaussian activation function. 0 when x = -inf, 1 when x = 0 and 0 when x = inf.
span: 0 < y < 1
y = exp(-x*s*x*s)
d = -2*x*s*y*s
gaussian_symmetric
Symmetric gaussian activation function. -1 when x = -inf, 1 when x = 0 and -1 when x = inf.
span: -1 < y < 1
y = exp(-x*s*x*s)*2 - 1
d = -2*x*s*(y+1)*s
elliot
Fast (sigmoid-like) activation function defined by David Elliott.
span: 0 < y < 1
y = ((x*s) / 2) / (1 + |x*s|) + 0.5
d = s*1/(2*(1 + |x*s|)*(1 + |x*s|))
elliot_symmetric
Fast (symmetric sigmoid-like) activation function defined by David Elliott.
span: -1 < y < 1
y = (x*s) / (1 + |x*s|)
d = s*1/((1 + |x*s|)*(1 + |x*s|))
linear_piece
Bounded linear activation function.
span: 0 <= y <= 1
y = x*s, d = 1*s
linear_piece_symmetric
Bounded linear activation function.
span: -1 <= y <= 1
y = x*s, d = 1*s
sin_symmetric
Periodic sine activation function.
span: -1 <= y <= 1
y = sin(x*s)
d = s*cos(x*s)
cos_symmetric
Periodic cosine activation function.
span: -1 <= y <= 1
y = cos(x*s)
d = s*-sin(x*s)
sin
Periodic sine activation function.
span: 0 <= y <= 1
y = sin(x*s)/2 + 0.5
d = s*cos(x*s)/2
cos
Periodic cosine activation function.
span: 0 <= y <= 1
y = cos(x*s)/2 + 0.5
d = s*-sin(x*s)/2
Error function used during training.
linear
Standard linear error function.
tanh
Tanh error function, usually better but can require a lower learning rate. This error function aggressively targets outputs that differ much from the desired values, while largely ignoring outputs that differ only a little. This error function is not recommended for cascade training and incremental training.
Stop criteria used during training.
mse
Stop criterion is the Mean Square Error (MSE) value.
bit
Stop criterion is the number of bits that fail. The number of bits means the number of output neurons which differ more than the bit fail limit. The bits are counted over all of the training data, so this number can be higher than the number of training samples.
# The typical XOR example.
package require fann

fann create xor 3 2 3 1

xor params -activation_function {sigmoid_symmetric sigmoid_symmetric sigmoid_symmetric sigmoid_symmetric}
xor function hidden sigmoid_symmetric
xor function output sigmoid_symmetric
xor function layer 1 sigmoid_symmetric
xor function layer 2 sigmoid_symmetric

xor trainondata 500000 0.001 {{-1 -1} {-1 1} {1 -1} {1 1}} {-1 1 1 -1}

puts 1:1:[format "%.5f" [xor run {1 1}]]
puts 1:-1:[format "%.5f" [xor run {1 -1}]]
puts -1:1:[format "%.5f" [xor run {-1 1}]]
puts -1:-1:[format "%.5f" [xor run {-1 -1}]]

xor destroy
1:1:-0.95379
1:-1:0.92959
-1:1:0.95144
-1:-1:-0.99688
Alexandros Stergiakis <sterg@kth.se> |
Copyright (C) 2008 Alexandros Stergiakis

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.