
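Given the error, backpropagation works backwards through the network and updates the weights using the gradients of the error. First, we need to compute the deltas of the weights and biases. Note that $\delta z_2$ is used to update $w_2$ and $b_2$, and $\delta z_1$ is used to update $w_1$ and $b_1$:

$$\delta z_2 = \text{error} \odot \sigma'(z_2), \qquad \delta b_2 = \delta z_2, \qquad \delta w_2 = a_1^{T}\, \delta z_2$$

$$\delta a_1 = \delta z_2\, w_2^{T}, \qquad \delta z_1 = \delta a_1 \odot \sigma'(z_1), \qquad \delta b_1 = \delta z_1, \qquad \delta w_1 = a_0^{T}\, \delta z_1$$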

This is written in TensorFlow code as follows: 

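# Output layer: scale the error element-wise by the activation slope at z_2
d_z_2 = tf.multiply(error, sigmoidprime(z_2))
d_b_2 = d_z_2
d_w_2 = tf.matmul(tf.transpose(a_1), d_z_2)

# Hidden layer: push the delta back through w_2, then through the activation at z_1
d_a_1 = tf.matmul(d_z_2, tf.transpose(w_2))
d_z_1 = tf.multiply(d_a_1, sigmoidprime(z_1))
d_b_1 = d_z_1
d_w_1 = tf.matmul(tf.transpose(a_0), d_z_1)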
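
One way to sanity-check these deltas is to compare them against numerical gradients. The sketch below is not the TensorFlow code from the text: it mirrors the same operations in NumPy so it can run on its own. It rests on several assumptions that do not come from the original: error is taken to be a_2 - y, the loss is 0.5 times the sum of squared errors, sigmoidprime is the derivative of the logistic sigmoid, and the layer sizes, the forward, backward, loss and numeric_grad helpers, and the random data are made up for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoidprime(x):
    # Derivative of the logistic sigmoid (assumed definition).
    s = sigmoid(x)
    return s * (1.0 - s)

def forward(a_0, w_1, b_1, w_2, b_2):
    z_1 = a_0 @ w_1 + b_1
    a_1 = sigmoid(z_1)
    z_2 = a_1 @ w_2 + b_2
    a_2 = sigmoid(z_2)
    return z_1, a_1, z_2, a_2

def backward(a_0, y, w_1, b_1, w_2, b_2):
    # Same sequence of deltas as in the text, with NumPy in place of TensorFlow.
    z_1, a_1, z_2, a_2 = forward(a_0, w_1, b_1, w_2, b_2)
    error = a_2 - y                      # assumption: error = prediction - target
    d_z_2 = error * sigmoidprime(z_2)
    d_b_2 = d_z_2
    d_w_2 = a_1.T @ d_z_2
    d_a_1 = d_z_2 @ w_2.T
    d_z_1 = d_a_1 * sigmoidprime(z_1)
    d_b_1 = d_z_1
    d_w_1 = a_0.T @ d_z_1
    return d_w_1, d_b_1, d_w_2, d_b_2

def numeric_grad(f, w, eps=1e-6):
    # Central finite differences of the scalar f() with respect to each entry of w.
    grad = np.zeros_like(w)
    for idx in np.ndindex(w.shape):
        old = w[idx]
        w[idx] = old + eps
        plus = f()
        w[idx] = old - eps
        minus = f()
        w[idx] = old
        grad[idx] = (plus - minus) / (2.0 * eps)
    return grad

# Hypothetical sizes: 4 samples, 3 inputs, 5 hidden units, 2 outputs.
rng = np.random.default_rng(0)
a_0 = rng.normal(size=(4, 3))
y = rng.uniform(size=(4, 2))
w_1, b_1 = rng.normal(size=(3, 5)), rng.normal(size=(1, 5))
w_2, b_2 = rng.normal(size=(5, 2)), rng.normal(size=(1, 2))

def loss():
    # Assumed loss: half the sum of squared errors over the batch.
    a_2 = forward(a_0, w_1, b_1, w_2, b_2)[-1]
    return 0.5 * np.sum((a_2 - y) ** 2)

d_w_1, d_b_1, d_w_2, d_b_2 = backward(a_0, y, w_1, b_1, w_2, b_2)
num_w_1 = numeric_grad(loss, w_1)
num_w_2 = numeric_grad(loss, w_2)
num_b_2 = numeric_grad(loss, b_2)

print("max |d_w_1 - numeric|:", np.max(np.abs(d_w_1 - num_w_1)))
print("max |d_w_2 - numeric|:", np.max(np.abs(d_w_2 - num_w_2)))
```

Under these assumptions the weight deltas should agree with the numerical gradients up to finite-difference error. The bias deltas keep one row per sample, so they need to be summed over the batch axis before they can be compared with the gradient of a single shared bias vector.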