Neural networks for regression - a comprehensive overview - Part 4
After covering the intricate details of the backpropagation algorithm, we now move on to practical application by implementing a neural network for regression in C#.
In this post, we will build on what we covered in the previous post and demonstrate how to code a neural network. We'll aim to keep the explanation as concise as possible.
Defining interfaces
In this section, we define a few interfaces that we will need. They can later be implemented by custom classes for greater flexibility and extensibility.
Defining activation functions
Activation functions are a key component of neural networks that introduce non-linearity into the model, allowing it to learn and model complex patterns in the data. Activation functions determine the output of a neuron based on its input.
They play a role both in their natural form and through their derivatives, as derivatives are essential for calculating gradients during the backpropagation process. Therefore, we will define the following contract (interface) for them.
public interface IActivationFunction
{
    double Evaluate(double input);

    double EvaluateDerivative(double input);
}
An example of an activation function is the hyperbolic tangent, $\tanh$.
public class TanhActivationFunction : IActivationFunction
{
    public double Evaluate(double input)
    {
        return Math.Tanh(input);
    }

    public double EvaluateDerivative(double input)
    {
        return 1 - Math.Pow(Math.Tanh(input), 2);
    }
}
Since we are working with regression, we also need the identity activation function for the output unit.
public class IdentityActivationFunction : IActivationFunction
{
    public double Evaluate(double input)
    {
        return input;
    }

    public double EvaluateDerivative(double input)
    {
        return 1.0;
    }
}
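As a quick sanity check (a sketch added here, not part of the training code itself), we can verify that EvaluateDerivative agrees with a finite-difference approximation of Evaluate:

```csharp
// Numerical sanity check for an activation function's derivative:
// compare the analytic derivative against a central finite difference.
var tanh = new TanhActivationFunction();

const double h = 1e-6;
var x = 0.5;

var analytic = tanh.EvaluateDerivative(x);
var numeric = (tanh.Evaluate(x + h) - tanh.Evaluate(x - h)) / (2 * h);

Console.WriteLine(Math.Abs(analytic - numeric) < 1e-6); // prints True
```

The same check applied to IdentityActivationFunction trivially returns 1.0 on both sides.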
Defining algorithms for training
For a neural network to be effective, we need to determine the weights that minimize the cost function. To accomplish this, various techniques can be used, and we can define the following contract to guide the process.
public interface IANNTrainer
{
    void Train(ANNForRegression ann, DataSet set);
}
We will now implement a gradient descent algorithm, using the derivatives computed through the backpropagation algorithm (as discussed in the previous post).
public class GradientDescentANNTrainer : IANNTrainer
{
    private ANNForRegression _ann;

    public void Train(ANNForRegression ann, DataSet set)
    {
        _ann = ann;

        Fit(set);
    }

    #region Private Methods

    private void Fit(DataSet set)
    {
        var numberOfHiddenUnits = _ann.NumberOfHiddenUnits;

        var a = new double[numberOfHiddenUnits];
        var z = new double[numberOfHiddenUnits];
        var delta = new double[numberOfHiddenUnits];

        // Learning rate
        var nu = 0.1;

        // Initialize all weights with random values in [0, 1)
        var rnd = new Random();
        for (var j = 0; j < numberOfHiddenUnits; j++)
        {
            for (var i = 0; i < _ann.NumberOfFeatures; i++)
                _ann.HiddenWeights[j, i] = rnd.NextDouble();

            _ann.HiddenBiasesWeights[j] = rnd.NextDouble();
        }

        for (var j = 0; j < numberOfHiddenUnits; j++)
            _ann.OutputWeights[j] = rnd.NextDouble();

        _ann.OutputBiasesWeights = rnd.NextDouble();

        for (var n = 0; n < 10000; n++)
        {
            foreach (var record in set.Records)
            {
                // Forward propagate
                for (var j = 0; j < numberOfHiddenUnits; j++)
                {
                    a[j] = 0.0;
                    for (var i = 0; i < _ann.NumberOfFeatures; i++)
                    {
                        var feature = set.Features[i];
                        a[j] = a[j] + _ann.HiddenWeights[j, i] * record.Data[feature];
                    }

                    // Add biases
                    a[j] = a[j] + _ann.HiddenBiasesWeights[j];

                    z[j] = _ann.HiddenActivationFunction.Evaluate(a[j]);
                }

                var b = 0.0;
                for (var j = 0; j < numberOfHiddenUnits; j++)
                    b = b + _ann.OutputWeights[j] * z[j];

                b = b + _ann.OutputBiasesWeights;

                var y = b;

                // Evaluate the error for the output
                var d = y - record.Target;

                // Backpropagate this error to the hidden units
                for (var j = 0; j < numberOfHiddenUnits; j++)
                    delta[j] = d * _ann.OutputWeights[j] * _ann.HiddenActivationFunction.EvaluateDerivative(a[j]);

                // Update the output weights with the computed derivatives
                for (var j = 0; j < numberOfHiddenUnits; j++)
                    _ann.OutputWeights[j] = _ann.OutputWeights[j] - nu * d * z[j];

                _ann.OutputBiasesWeights = _ann.OutputBiasesWeights - nu * d;

                // Update the hidden weights
                for (var j = 0; j < numberOfHiddenUnits; j++)
                {
                    for (var i = 0; i < _ann.NumberOfFeatures; i++)
                    {
                        var feature = set.Features[i];
                        _ann.HiddenWeights[j, i] = _ann.HiddenWeights[j, i] - nu * delta[j] * record.Data[feature];
                    }

                    _ann.HiddenBiasesWeights[j] = _ann.HiddenBiasesWeights[j] - nu * delta[j];
                }
            }
        }
    }

    #endregion
}
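In symbols, each pass over a training example performs the following updates, where $\nu$ is the learning rate, $h$ is the hidden activation function, $x_i$ the input features, $t$ the target, and $y$ the network output:

$$w_j^{(2)} \leftarrow w_j^{(2)} - \nu\,\delta\,z_j, \qquad \delta = y - t$$

$$w_{ji}^{(1)} \leftarrow w_{ji}^{(1)} - \nu\,\delta_j\,x_i, \qquad \delta_j = \delta\,w_j^{(2)}\,h'(a_j)$$

The bias weights are updated analogously, with the corresponding input fixed to $1$.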
Defining the neural network
With these interfaces defined, implementing a neural network becomes a fairly straightforward task.
public class ANNForRegression
{
    public double[,] HiddenWeights { get; set; }

    public double[] HiddenBiasesWeights { get; set; }

    public double[] OutputWeights { get; set; }

    public double OutputBiasesWeights { get; set; }

    public int NumberOfFeatures { get; set; }

    public int NumberOfHiddenUnits { get; set; }

    public IActivationFunction HiddenActivationFunction { get; set; }

    public IANNTrainer Trainer { get; set; }

    public ANNForRegression(int numberOfFeatures, int numberOfHiddenUnits, IActivationFunction hiddenActivationFunction, IANNTrainer trainer)
    {
        NumberOfFeatures = numberOfFeatures;
        NumberOfHiddenUnits = numberOfHiddenUnits;
        HiddenActivationFunction = hiddenActivationFunction;
        Trainer = trainer;

        HiddenWeights = new double[NumberOfHiddenUnits, NumberOfFeatures];
        HiddenBiasesWeights = new double[NumberOfHiddenUnits];
        OutputWeights = new double[NumberOfHiddenUnits];
    }

    public void Train(DataSet set)
    {
        Trainer.Train(this, set);
    }

    public double Predict(DataToPredict record)
    {
        var a = new double[NumberOfHiddenUnits];
        var z = new double[NumberOfHiddenUnits];

        // Forward propagate
        for (var j = 0; j < NumberOfHiddenUnits; j++)
        {
            a[j] = 0.0;
            for (var i = 0; i < NumberOfFeatures; i++)
            {
                var data = record.Data.ElementAt(i);
                a[j] = a[j] + HiddenWeights[j, i] * data.Value;
            }

            // Add biases
            a[j] = a[j] + HiddenBiasesWeights[j];

            z[j] = HiddenActivationFunction.Evaluate(a[j]);
        }

        var b = 0.0;
        for (var j = 0; j < NumberOfHiddenUnits; j++)
            b = b + OutputWeights[j] * z[j];

        b = b + OutputBiasesWeights;

        return b;
    }
}
This code includes two notable methods: Train and Predict. The Train method allows us to train the neural network using a training algorithm, while the Predict method enables us to make predictions on previously unseen values.
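Wiring everything together might look like the following hypothetical driver (the original post does not show one, and the DataSet, DataRecord, and DataToPredict shapes are assumptions inferred from how the code above accesses them). Here we train the network to approximate $f(x) = x^2$ on $[-1, 1]$:

```csharp
// Hypothetical driver: train the network on samples of f(x) = x^2,
// then query it at an unseen point.
var set = new DataSet
{
    Features = new List<string> { "x" },
    Records = new List<DataRecord>()
};

for (var x = -1.0; x <= 1.0; x += 0.1)
{
    set.Records.Add(new DataRecord
    {
        Data = new Dictionary<string, double> { ["x"] = x },
        Target = x * x
    });
}

var ann = new ANNForRegression(
    numberOfFeatures: 1,
    numberOfHiddenUnits: 5,
    hiddenActivationFunction: new TanhActivationFunction(),
    trainer: new GradientDescentANNTrainer());

ann.Train(set);

var prediction = ann.Predict(new DataToPredict
{
    Data = new Dictionary<string, double> { ["x"] = 0.5 }
});
// prediction should land close to 0.25 after training
```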
That's enough about the code for now. It's time to see it in action, and we will now explore how a neural network can approximate any function we desire.
Neural networks for regression - a comprehensive overview - Part 5