ARTIFICIAL NEURAL NETWORKS

EE543 Questions on Chapter 6

by Ugur HALICI

 

 

Q1) Consider a perceptron for which u ∈ R² and x ∈ {0,1}. Let the desired output be 0 when elements of class A = {(2,4),(3,2),(3,4)} are applied as input, and let it be 1 for the class B = {(1,0),(1,2),(2,1)}. Answer the following questions by writing analytically the line described by the perceptron and drawing it on the two-dimensional input space:

a) Does the line described by the perceptron with the initial connection weights w0 = 0, w1 = 1, w2 = 1 linearly separate the classes A and B?

b) Let the learning rate η be 0.5. What will be the connection weights and the line described by the perceptron after applying u^1 = (1,2), then u^2 = (2,4), and then u^3 = (1,0) as input?

c) Will the perceptron convergence procedure terminate if the input patterns from classes A and B are repeatedly applied with a very small learning rate? Why?

d) Now add the sample (5,2) to class B. What is your answer now? Why?

 

Solution

a) With w0 = 0, w1 = 1, w2 = 1 the perceptron describes the line u1 + u2 = 0. Every sample of both classes satisfies u1 + u2 > 0 and so lies on the same side of this line; hence these initial weights do not linearly separate the classes A and B.

b) The activation at step k is

a(k) = w0(k-1)·u0(k) + w1(k-1)·u1(k) + w2(k-1)·u2(k),  where u0(k) is always 1,

the output is x(k) = 1 if a(k) > 0 and x(k) = 0 otherwise, and the weights are updated as w(k) = w(k-1) + η (y(k) - x(k)) u(k). Starting from w(0) = (0, 1, 1):

u^1 = (1,2) ∈ B, y = 1: a = 0 + 1·1 + 1·2 = 3 > 0, so x = 1 and y - x = 0; the weights do not change, w(1) = (0, 1, 1).

u^2 = (2,4) ∈ A, y = 0: a = 0 + 1·2 + 1·4 = 6 > 0, so x = 1 and y - x = -1; w(2) = (0, 1, 1) + 0.5·(-1)·(1, 2, 4) = (-0.5, 0, -1).

u^3 = (1,0) ∈ B, y = 1: a = -0.5 + 0·1 + (-1)·0 = -0.5 < 0, so x = 0 and y - x = 1; w(3) = (-0.5, 0, -1) + 0.5·1·(1, 1, 0) = (0, 0.5, -1).

The resulting perceptron line is 0.5u1 - u2 = 0, i.e. u2 = 0.5u1.
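A minimal Python sketch replaying these three updates (not part of the original solution; the convention x = 1 when a > 0, else 0, is assumed as above):

    import numpy as np

    w = np.array([0.0, 1.0, 1.0])        # (w0, w1, w2)
    eta = 0.5

    # (pattern, desired output): class B -> 1, class A -> 0
    steps = [((1, 2), 1), ((2, 4), 0), ((1, 0), 1)]

    for p, y in steps:
        u = np.array([1.0, *p])          # u0 = 1 absorbs the threshold
        x = 1 if w @ u > 0 else 0        # perceptron output
        w = w + eta * (y - x) * u
        print(w)                         # (0,1,1), (-0.5,0,-1), (0,0.5,-1)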

c) Yes: because the classes A and B are linearly separable, the perceptron convergence procedure will terminate after a finite number of weight corrections when the learning rate is sufficiently small.


d) With (5,2) added, the classes are no longer linearly separable: the class A sample (3,2) now lies on the segment joining the class B samples (1,2) and (5,2). Therefore the perceptron convergence procedure will not terminate, no matter how small the learning rate is.

 

Q2) Upon the presentation of the kth training pattern u^k at step t, the weight vector w(t) is updated as:

w(t+1) = w(t) + η (y^k - x^k) u^k

where y^k is the desired output, x^k is the actual output, and η is the learning rate, which is a positive constant. If the set of training patterns is linearly separable, prove that the perceptron training algorithm defined above converges to a correct solution.

Hint: Let w* be a correct solution, which separates all patterns u^k, k = 1..K, correctly. Show that at each step ||w(t+1) - w*|| is less than or equal to ||w(t) - w*|| when η is sufficiently small.
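One way to begin, following the hint (a sketch of the standard argument, not the original solution): expanding the squared distance,

||w(t+1) - w*||² = ||w(t) - w*||² + 2η (y^k - x^k) (w(t) - w*)·u^k + η² (y^k - x^k)² ||u^k||²

If u^k is classified correctly, y^k = x^k and the weights do not move. If it is misclassified, (y^k - x^k) w(t)·u^k ≤ 0 while (y^k - x^k) w*·u^k > 0 (w* separates the finite pattern set with a strictly positive margin), so the middle term is strictly negative; for η sufficiently small it dominates the η² term, giving ||w(t+1) - w*|| < ||w(t) - w*||.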

 

Q3) The 2:4:2 neural network shown in the figure is to be trained with the backpropagation algorithm to evaluate the 1's complement of the applied 2-bit binary number.

a) Write the expressions for the error terms and the weight update formulas for the hidden layer and output layer neurons of this network, for an arbitrary output characteristic f(a).

b) Repeat part a) if the piecewise linear function shown in the figure is used as the output characteristic function.

c) Initially assume that wji = 0.1 for all input-to-hidden layer connections and wil = 0.2 for the hidden-to-output layer connections. Find the numeric values of the error terms and weight updates by considering part b).
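For part a), the standard backpropagation expressions in this problem set's notation (wji is the connection from neuron j to neuron i, as in Q4 and Q6) are, as a sketch of what is being asked for rather than the original solution:

for an output neuron i:  δi = (yi - xi) f'(ai)  and  Δwji = η δi uj,

for a hidden neuron j:   δj = f'(aj) Σi wji δi  and  Δwkj = η δj uk,

where uj is the output of hidden neuron j, uk is network input k, and the sum runs over the output neurons i fed by neuron j.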

 

Q4) Instead of having a single connection wji from neuron j to neuron i with a negative or positive value, consider a neuron model where a connection pair (w+ji, w-ji), each being nonnegative, is to be used.


Similarly, instead of the threshold θ we have a nonnegative threshold pair θ+, θ-, and the neuron output characteristic is defined as

where

Derive the error term and weight update formulas for the backpropagation algorithm if we have a multilayer feedforward network of this kind of neuron.
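Since the defining equations are not reproduced above, a natural reading (an assumption, not the original definition) is xi = f(ai) with ai = Σj (w+ji - w-ji) uj - (θ+ - θ-). Under that assumption the chain rule gives the usual error terms δi, and

Δw+ji = η δi uj,   Δw-ji = -η δi uj,

with the updated weights clipped at zero so that each pair stays nonnegative (one common way to enforce the constraint; the original handling is not recoverable).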

 

Q5) Design a multilayer perceptron which will learn to recognize various forms of the letters {C, L, T} placed on a 3x3 grid through the backpropagation algorithm.

 

a) Design a one-layer network, indicating what should be applied at the input layer and what should be expected at the output layer, and showing the number of neurons, the connections between them, and the neurons' output function.

b) Repeat (a) for a two-layer network by adding a hidden layer.
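One possible setup for part b) (a sketch; the particular letter bitmaps, hidden layer size, and weight initialization below are assumptions, not part of the question): the nine grid cells feed nine binary inputs, and three sigmoid output neurons code the letters one-hot.

    import numpy as np

    # 3x3 binary pixel patterns, flattened row by row (assumed canonical forms)
    C = np.array([1,1,1, 1,0,0, 1,1,1])
    L = np.array([1,0,0, 1,0,0, 1,1,1])
    T = np.array([1,1,1, 0,1,0, 0,1,0])

    def forward(u, W1, W2):
        """Forward pass of a 9:H:3 network with sigmoid units (biases omitted)."""
        sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
        h = sigmoid(W1 @ u)       # hidden layer outputs
        return sigmoid(W2 @ h)    # three outputs, one-hot target per letter

    rng = np.random.default_rng(0)
    H = 4                                    # assumed hidden layer size
    W1 = rng.normal(scale=0.5, size=(H, 9))  # input-to-hidden weights
    W2 = rng.normal(scale=0.5, size=(3, H))  # hidden-to-output weights
    print(forward(C, W1, W2))                # untrained outputs for pattern C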

 

Q6) Given the extended delta rule Δwji = η (yi - xi) uj f'(ai), where η is the learning rate, yi is the desired output, xi is the actual output of neuron i, uj is input j, and f'(ai) is the derivative of the output function evaluated at the current activation value ai. Assume f(a) is chosen to be the sigmoid function, that is, f(a) = 1/(1 + e^(-ka)).

a) Explain under which conditions wji decreases/increases.

b) Find the vector Δwi when η = 0.1, yi = 1, u^T = (-1, 1, 1), and the current value of the weight vector is wi^T = (-3, -5, 2).
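A quick numeric check for part b) (a sketch, not an official solution; the problem leaves k unspecified, so k = 1 is an assumption below, and the result scales linearly with k):

    import numpy as np

    eta, y, k = 0.1, 1.0, 1.0           # k = 1 is an assumption
    u = np.array([-1.0, 1.0, 1.0])
    w = np.array([-3.0, -5.0, 2.0])

    a = w @ u                            # activation: 3 - 5 + 2 = 0
    x = 1.0 / (1.0 + np.exp(-k * a))     # sigmoid output: 0.5 at a = 0
    fprime = k * x * (1.0 - x)           # f'(a) = k f(a)(1 - f(a)) = k/4 at a = 0
    dw = eta * (y - x) * fprime * u      # extended delta rule
    print(dw)                            # 0.0125*k * (-1, 1, 1) = [-0.0125 0.0125 0.0125]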

 

Q7) Consider an N:H:1 feedforward network whose neurons have the sigmoid activation function

f(a) = 1/(1 + exp(-a/T))

a) What is f'(a)?

b) Draw f'(a) with respect to a.

c) Suppose that you want to apply the backpropagation algorithm. What is the problem if f(a) for an output neuron is very close to 1 but the desired output is 0, or vice versa?
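As a hedged aid for parts a) and c) (a standard identity, not the original solution): differentiating gives

f'(a) = (1/T) f(a) (1 - f(a)),

which peaks at a = 0 (value 1/(4T)) and tends to 0 as f(a) approaches 0 or 1. The backpropagation error term (y - x) f'(a) therefore nearly vanishes precisely when the output saturates at the wrong extreme, so learning becomes very slow.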

 

Q8) Consider a perceptron for which u ∈ R² and

x = +1 if a > 0,   x = -1 if a < 0,

where a = w0u0 + w1u1 + w2u2 is the activation, with u0 = 1.

Let the desired output be +1 when elements of class A = {(1,2),(2,4),(3,3),(4,4)} are applied as input and let it be -1 for the class B = {(0,0),(2,3),(3,0),(4,2)}. Let the initial connection weights be w0(0) = +1, w1(0) = -2, w2(0) = +1 and the learning rate be η = 0.5.

This perceptron is to be trained by the perceptron convergence procedure, for which the weight update formula is

w(t+1) = w(t) + η (y^k - x^k(t)) u^k

a) i) Mark the elements belonging to class A with x and those belonging to class B with o on the input space.

ii) Draw the line represented by the perceptron considering the initial connection weights w(0).

iii) Find the regions for which the perceptron output is +1 and -1.

iv) Which elements of A and B are correctly classified, which elements are misclassified and which are unclassified?

b) If u = (4,4) is applied at the input, what will be w(1)?

c) Repeat a) considering w(1).

d) If u = (4,2) is then applied at the input, what will be w(2)?

e) Repeat a) considering w(2).

f) Do you expect the perceptron convergence procedure to terminate? Why?

 

Solution: The line defined by the weights of the perceptron is given by the formula

w0u0 + w1u1 + w2u2 = 0 (with u0 = 1)

a) The perceptron line is 1 - 2u1 + u2 = 0.

The data points

(1,2), (2,4), (3,0), (4,2) are classified correctly,

(3,3), (4,4), (0,0) are misclassified,

(2,3) is unclassified since it lies exactly on the line.

b)

u = (4,4) ∈ A ⇒ y = +1, while the current output is x = -1 (since 1 - 2·4 + 4 = -3 < 0), so

w(1) = w(0) + 0.5 (+1 - (-1)) (1, 4, 4) = (1, -2, 1) + (1, 4, 4) = (2, 2, 5).

c) The new perceptron line is 2 + 2u1 + 5u2 = 0.

 Data points

(1,2),(2,4),(3,3),(4,4) are classified correctly,

(0,0),(2,3),(3,0),(4,2) are misclassified.

d) u = (4,2) ∈ B ⇒ y = -1, while the current output is x = +1 (since 2 + 2·4 + 5·2 = 20 > 0), so

w(2) = w(1) + 0.5 (-1 - (+1)) (1, 4, 2) = (2, 2, 5) - (1, 4, 2) = (1, -2, 3).

e) The new perceptron line is 1 - 2u1 + 3u2 = 0.

 

Data points

(1,2), (2,4), (3,3), (4,4), (3,0), (4,2) are classified correctly,

(0,0), (2,3) are misclassified.

f) The perceptron convergence procedure will not terminate, since the classes A and B are not linearly separable: the class B sample (2,3) lies inside the convex hull of class A.
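A short Python sketch that replays the two updates and the final classification above (a verification aid, not part of the original solution; the convention that a = 0 leaves a point unclassified follows part a)):

    import numpy as np

    A = [(1, 2), (2, 4), (3, 3), (4, 4)]   # desired output +1
    B = [(0, 0), (2, 3), (3, 0), (4, 2)]   # desired output -1

    w = np.array([1.0, -2.0, 1.0])         # initial weights (w0, w1, w2)
    eta = 0.5

    def out(w, p):
        a = w @ np.array([1.0, *p])        # u0 = 1 absorbs the threshold
        return 0 if a == 0 else (1 if a > 0 else -1)

    for p, y in [((4, 4), +1), ((4, 2), -1)]:   # the two presented patterns
        w = w + eta * (y - out(w, p)) * np.array([1.0, *p])
        print(w)                                # [2. 2. 5.] then [1. -2. 3.]

    for p in A + B:                             # final classification with w(2)
        print(p, out(w, p))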