


Error correction with optimised Hopfield neural networks

Constantin Anton, Laurenţiu Ionescu, Alin Mazăre, *Ion Tutănescu, Gheorghe Şerban Electronics, Computers and Electrical Engineering Faculty - University of Pitesti, Romania {constantin.anton, laurentiu.ionescu, ion.tutanescu, alin.mazare, gheorghe.serban}@upit.ro

* Corresponding author

Abstract – We present in this paper a method of increasing both the storage capacity of Hopfield neural networks and their capability of error correction. The method combines the general principles of generating error-correcting codes in information theory with a gradient-heuristic algorithm. With this method we obtain improvements both in network storage capacity (the number of words memorized) and in the probability of correct answers. These types of networks can be used in several applications based on associativity, including error correction in communications, image reconstruction and object recognition.

Keywords - Hopfield neural networks, error correction, associativity.

I. INTRODUCTION

Hopfield neural networks can be used in many applications, including image reconstruction, the classification of objects and error correction. They allow the implementation of associativity, a process that takes place in the living world. Association is an effective method of accessing data in memory. A common memory, such as that found in a computer, is accessed by generating an address. If we are looking for some data, an address is generated, the location is read and its content is compared with the searched data; this is repeated until the sought data are finally found. An associative memory, such as one based on a Hopfield network, establishes an association between the input data taking part in the search and the data found. This can be used in error correction, where the input data is the word received from the communication channel, potentially with errors, and the data found is the correct word.

Hopfield networks have problems with storage capacity: a large number of neurons is needed to store more data. But in common binary representations, each neuron holds one bit of the word stored by the whole network, so more neurons mean longer words rather than more of them. Our goal is to improve network performance without increasing the number of neurons.

Research has been done on improving the storage capacity by using evolved learning algorithms, by choosing the nonlinear activation function of each neuron and by generating the weight matrix with different methods. The solution proposed in this paper brings three modifications which improve network performance: only a single neuron is updated at a time, the neuron to be updated is selected randomly, and the words stored in the network are chosen by a rule imposing a minimum Hamming distance. The following sections develop these ideas.

Section II presents the mathematical model of the Hopfield network. Section III provides the network optimization solutions: a heuristic gradient method for updating the network state and the type of words that can be stored so as to increase the storage capacity of the network. Section IV presents the experimental results obtained with a hardware-implemented neural network. Section V presents our conclusions.

II. HOPFIELD NEURAL NETWORKS

A Hopfield network is composed of neurons which are all interconnected. It follows a bidirectional associative memory (BAM) model, in which the state is updated in one step from the previous state. By memory state we mean the set of all neuron outputs; the memory state is thus updated from the previous state. The update process continues until equilibrium is reached, when the memory state no longer changes (no change, or a change that falls below an acceptable threshold) [1]. First, the state of each neuron is given by the weighted sum of its inputs, to which a nonlinear threshold function is applied:

x_i = \varphi\left(\sum_{j} w_{ij} x_j\right)    (1)

where: φ is a nonlinear threshold function, called the activation function; x_j represents the state of another neuron at the previous time step; w_ij is the weight of the connection between x_i and x_j. For the bipolar coding method we use a nonlinear activation function of the following form:

\varphi(x) = \begin{cases} +1, & x \geq \theta \\ -1, & x < \theta \end{cases}    (2)

where θ is a coefficient called threshold.
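As an illustration of eqs. (1) and (2), the following minimal Python/NumPy sketch updates a single neuron with the bipolar threshold function. The names activation and update_neuron are our own illustrative choices; this is not the hardware implementation described later.

import numpy as np

def activation(h, theta=0.0):
    # Eq. (2): bipolar threshold function, +1 if the weighted sum
    # reaches the threshold theta, -1 otherwise.
    return 1 if h >= theta else -1

def update_neuron(x, W, i, theta=0.0):
    # Eq. (1): the new state of neuron i is the activation of the
    # weighted sum of the neuron states (w_ii = 0, so the neuron's
    # own state does not contribute).
    h = float(np.dot(W[i], x))
    return activation(h, theta)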



The weight matrix, called W, contains all connection weights w_ij between neurons. The weight matrix is equivalent to the characteristic matrix A of a linear system with n unknowns, where n is the number of neurons, which can be written as:

WX = λX    (3)

where: X is the column vector of the neuron states x_i, W is the matrix of weights w_ij (connections between x_i and x_j), and λ is a scalar value. The solution of this equation corresponds to the equilibrium condition: at the next step the network keeps the same state. This solution depends on W. The W matrix has the following properties:

1. w_ij = w_ji for all i, j with i ≠ j;
2. w_ii = 0.

In other words, the connection between two neurons is the same regardless of which neuron you look from, and a neuron is not connected to itself. For example, in a network with 5 neurons (see Figure 1), the W matrix is symmetric about the main diagonal and contains 10 distinct non-zero entries, represented using the w_ij notation. The weight value w_12 is the same as w_21 and is therefore represented only once in the figure.

Figure 1 - Fully interconnected 5-neuron network.

To build the W matrix so that the system reaches equilibrium, a Hebbian learning rule is established, which follows the biological model of the connections between neurons [2]. This rule is illustrated in the following equation:

\Delta w_{ij} = \begin{cases} +1, & x_i = x_j \\ -1, & x_i \neq x_j \end{cases}    (4)

So, the weight between two neurons increases if their states are the same and decreases if their states differ. To train the network with k patterns (the patterns intended to be learned) using this rule, the weight is calculated as follows:

w_{ij} = \sum_{p=1}^{k} x_i^{(p)} x_j^{(p)}    (5)

where x_i^{(p)} and x_j^{(p)} are the states of the connected neurons for pattern p. Each stored pattern is an equilibrium point of the network; from other states, the network converges towards such a steady state. The convergence of the network is characterised using a parameter called energy [3]. Its value is given by:

E = -\frac{1}{2}\sum_{i \neq j} x_i x_j w_{ij} + \sum_{i} x_i \theta_i    (6)

where: x_i, x_j are the neuron states; w_ij is the weight of the connection between x_i and x_j; θ_i is the threshold of the nonlinear function for neuron i.
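A compact sketch of the Hebbian training rule (5) and of the energy (6) is given below, again in NumPy. The function names train_hebbian and energy are ours, the patterns are assumed to be bipolar vectors of -1/+1, and a single common threshold θ is assumed for all neurons.

import numpy as np

def train_hebbian(patterns):
    # Eq. (5): sum of outer products over the k stored bipolar patterns;
    # the diagonal is zeroed so that w_ii = 0.
    n = len(patterns[0])
    W = np.zeros((n, n))
    for p in patterns:
        p = np.asarray(p, dtype=float)
        W += np.outer(p, p)
    np.fill_diagonal(W, 0.0)
    return W

def energy(x, W, theta=0.0):
    # Eq. (6): with a zero diagonal, x.W.x equals the sum over i != j,
    # so E = -1/2 * sum_{i!=j} x_i x_j w_ij + theta * sum_i x_i.
    x = np.asarray(x, dtype=float)
    return -0.5 * x @ W @ x + theta * np.sum(x)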

III. SOLUTION FOR INCREASING NETWORK CAPACITY

The mathematical apparatus built around this type of network reveals, along with its positive features, the problems the network raises. The first problem is the relatively low storage capacity. Expression (7) shows the relationship between the storage capacity (the number of random bit strings which may be stored in the network) and the number of neurons contained in the network, both for the classic network model [4] and for more recent derivations [5]:

m = 0.18 n (7)

where m is the number of random words (bit strings) which can be stored and n is the network size (the number of neurons). The expression holds for words of arbitrary form. But if we store special words (orthogonal vectors, words with a fixed minimum Hamming distance between them, etc.), the storage capacity increases significantly, as we show in this paper. Along with selecting the information that is stored, there are other solutions for improving network capacity. One improvement is to introduce a random element in the network relaxation algorithm (the Boltzmann machine) [6]. A random element in the network state update brings the following benefits:
- it allows escape from a local "potential well" with a certain probability;
- it diversifies the network response, again with a certain probability.
The random element brings similar improvements to search algorithms over solution spaces, such as genetic algorithms or feed-forward neural networks. There, as here, a deterministic response is combined with a heuristic search (gradient-heuristic methods). The proposed solution (see Figure 2) brings the following improvements:
- it can escape from an oscillation area or a local minimum towards which the network tends;
- it optimizes the response and thus increases the storage capacity.
The only state from which the network cannot escape is the equilibrium state. Unlike other algorithms which combine deterministic and heuristic methods, here there is no need to stop the convergence process at a given moment. Therefore, we do not need circuits that check whether a candidate solution is correct and decide, based on the result, whether the convergence process stops, because in our case escape from an equilibrium point is not possible under any circumstances. A sketch of this update scheme is given below.
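The following NumPy sketch (our own illustrative code, not the hardware design) updates one randomly selected neuron per step with the deterministic rule of eqs. (1)-(2), and stops once n consecutive picks produce no change, which we take here as a simple equilibrium test; the function name recall and the stopping criterion are assumptions for illustration.

import numpy as np

def recall(x0, W, theta=0.0, max_steps=10000, rng=None):
    # Gradient-heuristic recall: at every step a single neuron, chosen
    # at random, is updated deterministically; the loop stops when
    # len(x) consecutive updates leave the state unchanged (assumed
    # here as a simple equilibrium criterion).
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    unchanged = 0
    for _ in range(max_steps):
        i = rng.integers(len(x))                     # random neuron selection
        new_state = 1.0 if W[i] @ x >= theta else -1.0
        unchanged = unchanged + 1 if new_state == x[i] else 0
        x[i] = new_state
        if unchanged >= len(x):
            break
    return x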


The limitation of the network storage capacity was taken into account in the selection of the types of words stored and tested. A binary word and its inverse are stored together, so two words are memorized in a single step. This feature comes from the bipolar representation of the data held in a neuron: even if we talk about binary strings of the form 101010... which are stored in the network, one bit per neuron, a neuron state of 0 is actually coded as -1 [7].

Consequently, the product of the outputs of two neurons that are in the same state is 1. This doubles the storage capacity of the network, because the "noise" of saving a single word is generated while two words are actually stored, as the check below illustrates.
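This observation can be checked in a few lines with NumPy; the 5-bit word used here is arbitrary and purely illustrative. The Hebbian contribution of a word and of its inverse are identical, so storing one also stores the other.

import numpy as np

x = np.array([1, -1, 1, 1, -1])           # an arbitrary stored word (bipolar coding)
W = np.outer(x, x)
np.fill_diagonal(W, 0)
W_inv = np.outer(-x, -x)
np.fill_diagonal(W_inv, 0)
# (-x_i)(-x_j) = x_i x_j, so the weight contribution is the same and
# both x and -x become equilibrium points of the network.
assert np.array_equal(W, W_inv)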

Figure 2 - Optimized error correction module.

When testing the network, the test words should not be at the same Hamming distance from two or more stored words.

IV. EXPERIMENTAL RESULTS

In order to analyze the network performance, we proceeded as follows. In the network implemented in hardware on a chip (see Figure 3), 2 or 4 words were stored. "Stored" means the learning procedure; in our solution, learning takes place in a single clock cycle. After the learning phase, we generated 10 test-validation words at Hamming distance 1 from the stored words and 10 test-validation words at Hamming distance 2 from the stored words. We counted the number of correct answers given by the network, obtaining a rate of correct answers. The test-validation words, received from a wireless communication channel with some induced noise, are placed at the input of the network. Thus, we obtain test-validation words with 1 or 2 erroneous bits.

Figure 3 - Experimental system for measuring the network performance.

The words were stored in pairs: a word and its inverse. The storage of a word and its inverse takes place in one step, because of the bipolar coding scheme chosen to represent the state of the neurons and because of the type of learning. When 4 words (2 pairs) are stored, they are chosen so that the Hamming distance between any two of them is at least 2. The test-validation words were chosen to differ by 1 bit or 2 bits from the words stored in the network, in order to see how the network responds; the position of the changed bit was determined randomly (see the sketch below). The results are presented in Table I and Table II, which show the percentage of correct responses.
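The sketch below shows one way the stored pairs and the test-validation words could be generated. The 8-neuron words w1 and w2 are hypothetical examples chosen only to satisfy the minimum Hamming distance rule, and the helper names hamming and flip_bits are ours.

import numpy as np

def hamming(a, b):
    # Number of positions in which two bipolar words differ.
    return int(np.sum(np.asarray(a) != np.asarray(b)))

def flip_bits(word, positions):
    # Return a copy of a bipolar word with the given bit positions
    # inverted; used to build test words at Hamming distance 1 or 2.
    w = np.asarray(word).copy()
    w[list(positions)] *= -1
    return w

# Two hypothetical stored pairs (word + inverse) on 8 neurons, chosen so
# that every two stored words are at Hamming distance >= 2.
w1 = np.array([ 1,  1,  1,  1, -1, -1, -1, -1])
w2 = np.array([ 1, -1,  1, -1,  1, -1,  1, -1])
stored = [w1, -w1, w2, -w2]
assert all(hamming(a, b) >= 2
           for i, a in enumerate(stored) for b in stored[i + 1:])

# Test-validation words with 1 or 2 erroneous bits at random positions.
rng = np.random.default_rng(0)
test_dh1 = flip_bits(w1, rng.choice(len(w1), size=1, replace=False))
test_dh2 = flip_bits(w1, rng.choice(len(w1), size=2, replace=False))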

TABLE I. THE NETWORK RESPONSE - 2 WORDS STORED

dH | 5 neurons | 6 neurons | 7 neurons | 8 neurons | 9 neurons | 10 neurons
1  | 100%      | 100%      | 100%      | 100%      | 100%      | 100%
2  | 70%       | 100%      | 100%      | 100%      | 100%      | 100%

TABLE II. THE NETWORK RESPONSE - 4 WORDS STORED

dH | 5 neurons | 6 neurons | 7 neurons | 8 neurons | 9 neurons | 10 neurons
1  | 71%       | 76%       | 79%       | 85%       | 100%      | 100%
2  | -         | 56%       | 70%       | 73%       | 85%       | 88%

The dash in the second row, first column, indicates that with 4 stored words of 5 bits each it is not possible to build a test word that differs by two bits from one of the 4 words and by more than two bits from the rest. To make a fair comparison, we took the results obtained in [1] for the configuration of a classical Hopfield network with 10 neurons. The associative Hopfield network presented by Rojas is the model we followed and is the theoretical apparatus illustrated in the previous section: the weight matrix is determined by Hebb's rule and the neuron state encoding scheme is bipolar. It can be seen from Table III that our 8-neuron network is comparable in performance with the 10-neuron network presented by Rojas.

TABLE III. COMPARISON OF RESULTS: ROJAS' MODEL (10 NEURONS) VS. OUR MODEL (6..10 NEURONS) - PROBABILITY OF CORRECT RESPONSE TO TEST WORDS WITH dH = 2

Stored   | Rojas' model (10 neurons) | 6 neurons | 7 neurons | 8 neurons | 9 neurons | 10 neurons
1 word   | 100%                      | 100%      | 100%      | 100%      | 100%      | 100%
2 words  | 100%                      | 100%      | 100%      | 100%      | 100%      | 100%
3 words  | 72.6%                     | 56%       | 70%       | 73%       | 85%       | 88%
4 words  | 71.1%                     | 56%       | 70%       | 73%       | 85%       | 88%

Because our solution memorizes pairs of words, the probability of a correct response is the same for 1 or 2 stored words, and likewise for 3 or 4 stored words. As can be seen, the storage capacity of our improved model is comparable to that of [1], which uses a larger number of neurons.


In conclusion, Table IV presents the estimated storage capacity for three types of neural networks with 10 neurons each: the classic Hopfield network (deduced from 0.18n), the Rojas model and our model:

TABLE IV. QUALITY INDICATOR OF NEURAL NETWORKS (ESTIMATED STORAGE CAPACITY)*

                                     | Hopfield classic | Rojas (5 iterations) | Our model
Storage capacity (10-neuron network) | 1.8              | 2.844                | 3.52

* The determination was made for the experimental results with 4 words memorized and test words with dH = 2.

Figure 4 shows a comparison between the Hopfield network model analyzed by Rojas and our solutions with 6 to 10 neurons.

a) Test words with dH = 1 from stored words

b) Test words with dH = 2 from stored words

Figure 4 - Comparative performance analysis for different Hopfield networks.

In Figure 4, the number of stored words is compared with the number of words correctly reproduced. The optimum is reached when the relation is linear, along the main diagonal of the coordinate system: x words stored and x words correctly reproduced in the test. Figure 4a shows that our network solutions with 6 and 10 neurons achieve this optimum. Figure 4b illustrates the network response when the test words are received at Hamming distance 2 from the stored words; here the superior performance of our networks with 10 and 6 neurons can be seen.

V. CONCLUSIONS

Our solution increases the storage capacity of the Hopfield network. This is achieved by using a gradient-heuristic scheme to update the network state: a single neuron is updated at a time, using a deterministic calculation (from the biological neuron model), but its selection is performed randomly. In addition, if we choose certain words to be stored in the network, we increase its storage capacity and error correction ability. Once a word is stored, the network can associate with it other received words (placed at the input) that have the same structure but erroneous bits. Thus, our system can perform local error correction. The experiments in this paper use small networks tested with words transmitted over a wireless communication channel; this allows correction of words with a maximum size of 10 bits. As future work, we will perform experiments with larger networks.

REFERENCES

[1] R. Rojas, Neural Networks: A Systematic Introduction, Springer-Verlag, Berlin, 1996.
[2] T. McMillen, P. Simen, and S. Behseta, "Hebbian learning in linear-nonlinear networks with tuning curves leads to near-optimal, multi-alternative decision making", Neural Networks, vol. 24, no. 5, pp. 417-426, 2011.
[3] D. Calabuig, J. F. Monserrat, and N. Cardona, "Convergence and stability of quantized Hopfield networks operating in a fully parallel mode", Neural Computation, vol. 22, no. 10, pp. 2507-2521, 2010.
[4] C. V. Morais, F. M. Zimmer, and S. G. Magalhaes, "Inverse freezing in the Hopfield fermionic Ising spin glass with a transverse magnetic field", Physics Letters A, vol. 375, no. 3, pp. 689-697, 2011.
[5] J. Hertz, A. Krogh, and R. Palmer, Introduction to the Theory of Neural Computation, Addison-Wesley, Redwood City, CA, 1991.
[6] M. Löwe and F. Vermet, "The storage capacity of the Hopfield model and moderate deviations", Statistics and Probability Letters, vol. 75, no. 4, pp. 237-248, 2005.
[7] A. Barra, A. Bernacchia, E. Santucci, and P. Contucci, "On the equivalence of Hopfield networks and Boltzmann machines", Neural Networks, vol. 34, pp. 1-9, 2012.
[8] H. Abe and Y. Osana, "Kohonen feature map associative memory with area representation", Proceedings of the IASTED International Conference on Artificial Intelligence and Applications (AIA 2006), pp. 250-255, 2006.
