Math Problem Statement
Is my Question 1 correct?

Definitions:
\( w_{jk} \) = the weight between hidden neuron \( k \) and output neuron \( j \)
\( y_j \) = the output of neuron \( j \), calculated as \( y_j = f(\text{net}_j) = f\!\left(\sum_k w_{jk}\, y_k\right) \)

Objective function:
\[ E = \frac{c^2}{2}\,\log_e\!\left(\left(\frac{t_i - y_i}{c} + b\right)^{2}\right) + \sum_k^{w} \frac{w_k^2/\theta}{1 + w_k^2/\theta} \]

\( \eta \) is the learning rate.

With respect to \( w_{jk} \), the derivative of \( E \) is calculated with the chain rule:
\[ \frac{\partial E}{\partial w_{jk}} = \frac{\partial E}{\partial y_j} \cdot \frac{\partial y_j}{\partial \text{net}_j} \cdot \frac{\partial \text{net}_j}{\partial w_{jk}} \]
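Before differentiating, a minimal numerical sketch of this setup (single output neuron, plain Python) may help sanity-check the algebra below; the function names, argument order, and sample structure are my own assumptions, not part of the question:

import math

def softsign(x):
    # Output-layer activation f(x) = x / (1 + |x|), as defined later in the question
    return x / (1.0 + abs(x))

def objective(w, y_hidden, t, c, b, theta):
    # w: output-layer weights w_k; y_hidden: hidden activations y_k; t: target
    net = sum(w_k * y_k for w_k, y_k in zip(w, y_hidden))
    y = softsign(net)
    data_term = (c ** 2 / 2.0) * math.log(((t - y) / c + b) ** 2)
    penalty = sum((w_k ** 2 / theta) / (1.0 + w_k ** 2 / theta) for w_k in w)
    return data_term + penalty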
Calculating \( \frac{\partial E}{\partial y_j} \):
\[ E(y_i) = \frac{c^2}{2}\,\log_e\!\left(\left(\frac{t_i - y_i}{c} + b\right)^{2}\right) + \sum_k^{w} \frac{w_k^2/\theta}{1 + w_k^2/\theta} \]
Using the log property \( \log(x^2) = 2\log(x) \):
\[ E(y_i) = \frac{c^2}{2}\cdot 2\,\log_e\!\left(\frac{t_i - y_i}{c} + b\right) + \sum_k^{w} \frac{w_k^2/\theta}{1 + w_k^2/\theta} \]
\[ E(y_i) = c^2\,\log_e\!\left(\frac{t_i - y_i}{c} + b\right) + \sum_k^{w} \frac{w_k^2/\theta}{1 + w_k^2/\theta} \]
Let's differentiate with respect to \( y_j \).

Term 1: \( c^2\,\log_e\!\left(\frac{t_i - y_i}{c} + b\right) \)

\( c^2 \) is a constant, so it stays in front. Let \( x = \frac{t_i - y_i}{c} + b \):
\[ \frac{\partial}{\partial y_j}\, c^2 \log_e x = c^2\cdot\frac{1}{x}\cdot\frac{\partial x}{\partial y_j} = c^2\cdot\frac{1}{x}\cdot\left(\frac{0-1}{c} + 0\right) = c^2\cdot\frac{1}{x}\cdot\left(-\frac{1}{c}\right) = -\frac{c}{x} \]
Remembering that \( x = \frac{t_i - y_i}{c} + b \):
\[ = \frac{-c}{\frac{t_i - y_i}{c} + b} \]
To simplify, write \( b \) over the common denominator \( c \):
\[ = \frac{-c}{\frac{t_i - y_i}{c} + \frac{bc}{c}} = \frac{-c}{\frac{t_i - y_i + bc}{c}} \]
Dividing by a fraction is the same as multiplying by its reciprocal (KCF: Keep, Change, Flip):
\[ = -c\cdot\frac{c}{t_i - y_i + bc} = \frac{-c^2}{t_i - y_i + bc} \]
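A quick central-difference check of this result (the sample values are arbitrary assumptions, chosen so the log argument stays positive):

import math

def term1(y, t, c, b):
    # Term 1 after the log simplification: c^2 * ln((t - y)/c + b)
    return c ** 2 * math.log((t - y) / c + b)

t, c, b, y = 0.8, 1.5, 2.0, 0.3
eps = 1e-6
numeric = (term1(y + eps, t, c, b) - term1(y - eps, t, c, b)) / (2 * eps)
analytic = -c ** 2 / (t - y + b * c)
print(numeric, analytic)   # both approximately -0.643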
Term 2: \( \sum_k^{w} \frac{w_k^2/\theta}{1 + w_k^2/\theta} \)

Since there is no instance of \( y_j \) in this term, its derivative is 0:
\[ \frac{\partial}{\partial y_j}\sum_k^{w} \frac{w_k^2/\theta}{1 + w_k^2/\theta} = 0 \]
Therefore:
\[ \frac{\partial E}{\partial y_j} = \frac{-c^2}{t_i - y_i + bc} \]

Calculating \( \frac{\partial y_j}{\partial \text{net}_j} \):
The output-layer activation is \( f(x) = \frac{x}{1 + |x|} \).
Let \( x = \text{net}_i^{out} \), assuming \( \text{net}_i^{out} > 0 \) so that \( |x| = x \) (in general the same steps give \( f'(x) = \frac{1}{(1 + |x|)^2} \)). Using the quotient rule \( \frac{u'v - v'u}{v^2} \):
\[ \frac{\partial y_j}{\partial \text{net}_j} = f'(\text{net}_i^{out}) = \frac{(\text{net}_i^{out})'\cdot(1 + \text{net}_i^{out}) - (1 + \text{net}_i^{out})'\cdot\text{net}_i^{out}}{(1 + \text{net}_i^{out})^2} \]
\[ = \frac{1\cdot(1 + \text{net}_i^{out}) - (0 + 1)\cdot\text{net}_i^{out}}{(1 + \text{net}_i^{out})^2} \]
\[ = \frac{\text{net}_i^{out} + 1 - \text{net}_i^{out}}{(1 + \text{net}_i^{out})^2} = \frac{1}{(1 + \text{net}_i^{out})^2} \]
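The same kind of finite-difference sketch can confirm this derivative; the test point is an arbitrary assumption, and abs() keeps the check valid for negative inputs as well:

def softsign(x):
    return x / (1.0 + abs(x))

x = 0.7          # any point other than 0, where |x| is not differentiable
eps = 1e-6
numeric = (softsign(x + eps) - softsign(x - eps)) / (2 * eps)
analytic = 1.0 / (1.0 + abs(x)) ** 2
print(numeric, analytic)   # both approximately 0.346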
Calculating \( \frac{\partial \text{net}_j}{\partial w_{jk}} \):
Since \( \text{net}_j = \sum_k w_{jk}\, y_k \), only the term containing \( w_{jk} \) contributes:
\[ \frac{\partial \text{net}_j}{\partial w_{jk}} = y_k \]
Putting the three factors together:
\[ \frac{\partial E}{\partial w_{jk}} = \frac{\partial E}{\partial y_j}\cdot\frac{\partial y_j}{\partial \text{net}_j}\cdot\frac{\partial \text{net}_j}{\partial w_{jk}} \]
\[ \frac{\partial E}{\partial w_{jk}} = \left(\frac{-c^2}{t_i - y_i + bc}\right)\cdot\left(\frac{1}{(1 + \text{net}_i^{out})^2}\right)\cdot y_k = \frac{-c^2\, y_k}{(t_i - y_i + bc)\,(1 + \text{net}_i^{out})^2} \]
Weight update:
\[ w_{jk}^{new} = w_{jk}^{old} - \eta\,\frac{\partial E}{\partial w_{jk}} = w_{jk}^{old} - \eta\,\frac{-c^2\, y_k}{(t_i - y_i + bc)\,(1 + \text{net}_i^{out})^2} = w_{jk}^{old} + \eta\,\frac{c^2\, y_k}{(t_i - y_i + bc)\,(1 + \text{net}_i^{out})^2} \]
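Assembled into one step, a minimal sketch of the resulting update; the names, argument order, and the use of abs() for the general softsign derivative are my assumptions, and the penalty term's own dependence on \( w_{jk} \) is left out, matching the derivation above:

def update_weight(w_jk, y_k, y_out, net_out, t, c, b, eta):
    # dE/dw_jk as derived above; abs() covers negative net values and
    # equals (1 + net)^2 whenever net > 0
    grad = -c ** 2 * y_k / ((t - y_out + b * c) * (1.0 + abs(net_out)) ** 2)
    return w_jk - eta * grad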
Solution
Math Problem Analysis
Mathematical Concepts
Chain Rule
Quotient Rule
Logarithmic Differentiation
Activation Functions
Formulas
Objective Function E
Derivative of E with respect to y_j
Derivative of y_j with respect to net_j
Derivative of net_j with respect to w_jk
Theorems
-
Suitable Grade Level
Advanced Mathematics