Backpropagation

In the context of neural networks, backpropagation is a method for computing the gradients of the loss function with respect to the weights of the network. Gradient descent (or one of its variants) then uses these gradients to update the weights, taking a small step in the direction of the negative gradient with the goal of arriving at a minimum of the loss.
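Concretely, the update for a single weight is the standard gradient descent rule below, where $\eta$ is the learning rate (a symbol not used elsewhere in this note):

$$
w \leftarrow w - \eta \, \frac{\partial C}{\partial w}
$$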

One neuron per layer

From this video
![[Pasted image 20250504100849.png|300]]

$$
\begin{aligned}
C_0 &= (a^{(L)} - y)^2 \\
a^{(L)} &= \sigma(z^{(L)}) \\
z^{(L)} &= w^{(L)} a^{(L-1)} + b^{(L)} \\
\frac{\partial C_0}{\partial w^{(L)}} &= 2\,(a^{(L)} - y)\,\sigma'(z^{(L)})\,a^{(L-1)}
\end{aligned}
$$
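Spelling out the chain rule behind that last line, with each factor read off from the definitions above:

$$
\frac{\partial C_0}{\partial w^{(L)}}
= \frac{\partial z^{(L)}}{\partial w^{(L)}}\,
  \frac{\partial a^{(L)}}{\partial z^{(L)}}\,
  \frac{\partial C_0}{\partial a^{(L)}}
= a^{(L-1)} \cdot \sigma'(z^{(L)}) \cdot 2\,(a^{(L)} - y)
$$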

Two neurons per layer

$$
\begin{aligned}
C_0 &= (a_1^{(L)} - y_1)^2 + (a_2^{(L)} - y_2)^2 \\
a_i^{(L)} &= \sigma(z_i^{(L)}) \\
z_1^{(L)} &= w_{11}^{(L)} a_1^{(L-1)} + w_{12}^{(L)} a_2^{(L-1)} + b_1 \\
z_2^{(L)} &= w_{21}^{(L)} a_1^{(L-1)} + w_{22}^{(L)} a_2^{(L-1)} + b_2 \\
z^{(L)} &= W^{(L)} A^{(L-1)} + B
\end{aligned}
$$
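A minimal NumPy sketch of the vectorized form $z^{(L)} = W^{(L)} A^{(L-1)} + B$; the array names and example values here are illustrative assumptions, not from the note or the video:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative values (assumed): previous-layer activations A^(L-1),
# weight matrix W^(L) with W[i-1, j-1] = w_ij, and bias vector B.
A_prev = np.array([0.3, 0.9])
W = np.array([[0.1, -0.4],
              [0.7,  0.2]])
B = np.array([0.05, -0.1])

# Vectorized forward pass: z^(L) = W^(L) A^(L-1) + B, then a^(L) = sigma(z^(L))
z = W @ A_prev + B
a = sigmoid(z)

# Matches the componentwise equation z_1 = w_11 a_1 + w_12 a_2 + b_1
assert np.isclose(z[0], W[0, 0] * A_prev[0] + W[0, 1] * A_prev[1] + B[0])
```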

To compute

$$
\frac{\partial C_0}{\partial w_{12}^{(L)}},
$$

we need to follow the chain rule for derivatives through the layers of the network.
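Since $w_{12}^{(L)}$ affects the cost only through $z_1^{(L)}$ and then $a_1^{(L)}$, the chain is:

$$
\frac{\partial C_0}{\partial w_{12}^{(L)}}
= \frac{\partial z_1^{(L)}}{\partial w_{12}^{(L)}}\,
  \frac{\partial a_1^{(L)}}{\partial z_1^{(L)}}\,
  \frac{\partial C_0}{\partial a_1^{(L)}}
$$

Evaluating each factor from the definitions above gives: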

$$
\frac{\partial C_0}{\partial w_{12}^{(L)}} = 2\,(a_1^{(L)} - y_1)\,\sigma'(z_1^{(L)})\,a_2^{(L-1)}
$$
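A quick finite-difference sanity check of this formula; the concrete numbers for $W$, $B$, $A^{(L-1)}$, and $y$ are made-up illustrative values, not from the note:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def cost(W, B, A_prev, y):
    """C_0 = sum_i (a_i^(L) - y_i)^2 for the two-neuron output layer."""
    a = sigmoid(W @ A_prev + B)
    return np.sum((a - y) ** 2)

# Assumed example values (illustrative only).
A_prev = np.array([0.3, 0.9])
W = np.array([[0.1, -0.4],
              [0.7,  0.2]])
B = np.array([0.05, -0.1])
y = np.array([1.0, 0.0])

# Analytic gradient from the formula above:
# dC0/dw_12 = 2 (a_1 - y_1) * sigma'(z_1) * a_2^(L-1)
z = W @ A_prev + B
a = sigmoid(z)
analytic = 2.0 * (a[0] - y[0]) * sigmoid_prime(z[0]) * A_prev[1]

# Central finite-difference approximation of the same derivative (w_12 = W[0, 1]).
eps = 1e-6
W_plus = W.copy();  W_plus[0, 1] += eps
W_minus = W.copy(); W_minus[0, 1] -= eps
numeric = (cost(W_plus, B, A_prev, y) - cost(W_minus, B, A_prev, y)) / (2 * eps)

print(analytic, numeric)  # the two values should agree to several decimal places
```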