Training your network is not that hard to do - it's the preparation that is harder.
First of all, you should normalize your data. That means you have to convert all your input and output values to a range of 0 to 1. If you are unsure how to do this, visit the Normalization page. Each training sample in your training set should be an object that looks as follows:
{ input: [], output: [] }
So an example of a training set would be (XOR):
var myTrainingSet = [
{ input: [0,0], output: [0] },
{ input: [0,1], output: [1] },
{ input: [1,0], output: [1] },
{ input: [1,1], output: [0] }
];
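As a sketch of the normalization step mentioned above, here is a generic min-max scaler. Note that the `normalize` helper is my own illustration, not a function provided by the library, and it assumes you know all values of a feature up front (the XOR data above is already within 0 to 1, so it needs no normalization):

```javascript
// Min-max normalization: scale a list of values into the range [0, 1].
// Illustrative helper, not part of the library.
function normalize(values) {
  var min = Math.min.apply(null, values);
  var max = Math.max.apply(null, values);
  return values.map(function (v) {
    return (v - min) / (max - min);
  });
}

normalize([0, 5, 10]); // [0, 0.5, 1]
```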
There is no fixed rule of thumb for choosing your network architecture. Adding more layers lets your neural network recognize more abstract relationships, although it requires more computation. Any function can be approximated with just one (big) hidden layer, but I do not advise this. If you're a beginner, I advise using one of the following architectures:
But for most problems, a perceptron is sufficient. Now you only have to determine the number and size of the network layers. I advise you to take a look at this StackExchange question for more help on deciding the hidden size.
For the training set I provided above (XOR), there are only 2 inputs and 1 output. I use the rule of thumb: input size + output size = hidden size, so the hidden layer gets 3 neurons. The creation of my network then looks like this:
var myNetwork = architect.Perceptron(2, 3, 1);
Finally, we're going to train the network. The function for training your network is very straightforward:
yourNetwork.train(yourData, yourOptions);
There are a lot of options. I won't go over all of them here, but you can check out the Network wiki for all the options.
I'm going to use the following options:
- log: 10 - I want to log the status every 10 iterations
- error: 0.03 - I want the training to stop once the error drops below 0.03
- iterations: 1000 - I want the training to stop after 1000 iterations if the error of 0.03 hasn't been reached
- rate: 0.3 - I want a learning rate of 0.3

So let's put it all together:
myNetwork.train(myTrainingSet, {
log: 10,
error: 0.03,
iterations: 1000,
rate: 0.3
});
// result: {error: 0.02955628620843985, iterations: 566, time: 31}
Now let us check if it actually works:
myNetwork.activate([0,0]); // [0.1257225731473885]
myNetwork.activate([0,1]); // [0.9371910625522613]
myNetwork.activate([1,0]); // [0.7770757408042104]
myNetwork.activate([1,1]); // [0.1639697315652196]
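The raw outputs are continuous values between 0 and 1; to read them as XOR predictions, you can round them at 0.5. The `toBinary` helper below is my own illustration, not something the library provides:

```javascript
// Threshold continuous network outputs at 0.5 to get binary predictions.
// Illustrative helper, not part of the library.
function toBinary(outputs) {
  return outputs.map(function (v) {
    return v >= 0.5 ? 1 : 0;
  });
}

toBinary([0.1257, 0.9372, 0.7771, 0.1640]); // [0, 1, 1, 0]
```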
And it works! If you want it to be more precise, lower the target error.
If you need more help, feel free to create an issue here!