First of all, I have to say this was the most laborious exercise I had done since starting this course.
1.3 Feedforward and cost function
a1 = [ones(m, 1), X];              % input layer with bias unit added
z2 = a1*Theta1';
a2 = [ones(m, 1), sigmoid(z2)];    % hidden layer activations, plus bias unit
z3 = a2*Theta2';
a3 = sigmoid(z3);                  % output layer: h_theta(x) for all m examples
I = eye(num_labels);
Y = I(y, :);                       % recode the labels y into one-hot rows
J = sum(sum((-Y.*log(a3) - (1-Y).*log(1-a3)) / m));
At this point I am still puzzled by I(y, :): how does it turn y (5000×1) into Y (5000×10), with the matching index marked in each row?
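For my own reference, here is a tiny sketch (with a made-up num_labels = 3 and four labels, not the real 5000×10 data) of what that indexing does: I(y, :) picks row y(i) of the identity matrix for each example, and that row is exactly the one-hot encoding of label y(i).

I = eye(3);          % 3x3 identity, one row per possible label
y = [2; 3; 1; 2];    % four example labels
Y = I(y, :);         % 4x3: row i of Y is row y(i) of I
% Y = [0 1 0;
%      0 0 1;
%      1 0 0;
%      0 1 0]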
1.4 Regularized cost function
r = lambda/2/m * (sum(sum(Theta1(:, 2:end).^2)) + sum(sum(Theta2(:, 2:end).^2)));
J = J + r;
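For reference, the r added here is the usual regularization term from the exercise, which is why the first column of each Theta (the bias weights) is skipped via the (:, 2:end) indexing:

$$r = \dfrac{\lambda}{2m}\left[\sum_{j}\sum_{k}\left(\Theta_{j,k}^{(1)}\right)^2 + \sum_{j}\sum_{k}\left(\Theta_{j,k}^{(2)}\right)^2\right]$$

with the sums over k running only over the non-bias columns.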
2.1 Sigmoid gradient
g = sigmoid(z).*(1-sigmoid(z));
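This is just the derivative of the sigmoid written in terms of the sigmoid itself:

$$g'(z) = \dfrac{d}{dz}\,g(z) = g(z)\bigl(1 - g(z)\bigr)$$

A quick sanity check: g(0) = 0.5, so sigmoidGradient(0) should return exactly 0.25.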
2.3 Backpropagation
$$\delta_{k}^{(3)} = \left(a_{k}^{(3)} - y_{k}\right)$$

$$\dfrac{\partial}{\partial \Theta_{ij}^{(l)}}J(\Theta) = D_{ij}^{(l)} = \dfrac{1}{m}\Delta_{ij}^{(l)}$$
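The exercise also defines the hidden-layer error, which the d2 line in the code below vectorizes (with examples as rows, the code computes d3*Theta2 instead of transposing Theta2, and drops the bias entry afterwards):

$$\delta^{(2)} = \left(\Theta^{(2)}\right)^{T}\delta^{(3)} \mathbin{.*} g'\!\left(z^{(2)}\right)$$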
d3 = a3 - Y;                                           % output-layer error
d2 = d3*Theta2 .* [ones(m, 1), sigmoidGradient(z2)];   % back-propagate through Theta2
D1 = d2(:, 2:end)'*a1;                                 % drop the bias-unit error, accumulate Delta1
D2 = d3'*a2;                                           % accumulate Delta2
Theta1_grad = Theta1_grad + D1/m;
Theta2_grad = Theta2_grad + D2/m;
2.5 Regularized Neural Networks
Theta1_grad(:, 2:end) = Theta1_grad(:, 2:end) + lambda/m*Theta1(:, 2:end);
Theta2_grad(:, 2:end) = Theta2_grad(:, 2:end) + lambda/m*Theta2(:, 2:end);
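For reference, these two lines implement the regularized gradient from the exercise, leaving the first column (the bias weights, j = 0) unregularized:

$$\dfrac{\partial}{\partial \Theta_{ij}^{(l)}}J(\Theta) = D_{ij}^{(l)} = \dfrac{1}{m}\Delta_{ij}^{(l)} + \dfrac{\lambda}{m}\Theta_{ij}^{(l)} \quad \text{for } j \geq 1$$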
The hidden layer