
Yeah... but his first impulse was to write backprop from scratch. I've seen the lectures and been dabbling with NNs for years, and I never thought to do it. I always thought the Stanford people made you do it on assignment 1 to pay your dues or something. I continue to think of Carmack as the master hacker.



> his first impulse was to write backprop from scratch

Backprop is a very simple algorithm, nothing to fear there. The hard part is calculating the derivatives if you want flexibility in how you build your model. But for feedforward networks with sigmoid activations, the equations to update the weights are a joke.
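
For concreteness, a minimal sketch of those update equations, assuming a single hidden layer, sigmoid activations everywhere, squared-error loss, and toy shapes made up for illustration:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    W1 = rng.normal(size=(3, 4))   # 3 inputs -> 4 hidden units
    W2 = rng.normal(size=(4, 1))   # 4 hidden units -> 1 output
    x = rng.normal(size=(1, 3))    # one training example
    t = np.array([[1.0]])          # its target
    lr = 0.1                       # learning rate

    # forward pass
    h = sigmoid(x @ W1)
    y = sigmoid(h @ W2)

    # backward pass, written out by hand: sigmoid'(z) = s * (1 - s)
    delta_out = (y - t) * y * (1 - y)             # error at the output layer
    delta_hid = (delta_out @ W2.T) * h * (1 - h)  # error pushed back one layer

    # weight updates
    W2 -= lr * (h.T @ delta_out)
    W1 -= lr * (x.T @ delta_hid)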


I'm not sure we agree on the definition of very simple. It may be very simple if you already know it...


https://m.youtube.com/watch?v=i94OvYb6noo&feature=youtu.be&t...

You won't regret it. One of the best explanations of backprop on the internet.


Backpropagation is based on optimization with derivatives (i.e. finding the maximum or minimum of a function from its derivatives, which is taught before university in my country), and on the chain rule for differentiating composed functions (first year of university).

But what I meant is: even if you look at the equations and the steps without completely understanding the insights behind them, it is a joke of an algorithm. It just does some multiplications, applies the new gradients, moves to the previous layer, and repeats (sketched below).
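
A hypothetical sketch of that backward sweep, reusing the sigmoid setup from the snippet above (activations[i] is the output of layer i from the forward pass, weights[i] connects layer i to layer i+1; names are mine, not from the thread):

    def backward_sweep(weights, activations, target, lr=0.1):
        # error at the output layer (squared-error loss, sigmoid output)
        delta = (activations[-1] - target) * activations[-1] * (1 - activations[-1])
        for i in reversed(range(len(weights))):
            grad = activations[i].T @ delta   # some multiplications...
            if i > 0:                         # ...move the error to the previous layer...
                delta = (delta @ weights[i].T) * activations[i] * (1 - activations[i])
            weights[i] -= lr * grad           # ...apply the new gradients, repeat
        return weights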


It's been a while, but I remember backprop starting at the end of the neural net and working backwards. Each weight that contributed to a wrong answer was weakened or even reversed by some small factor, and each weight that contributed to a correct answer was strengthened by some small factor.

So it's probably as simple as

newWeight = oldWeight +/- (stepValue * someFactor)
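
In gradient-descent terms that +/- is just the sign of the gradient; a toy example with made-up numbers:

    learning_rate = 0.1   # plays the role of stepValue
    gradient = 0.25       # hypothetical dLoss/dWeight for this particular weight
    old_weight = 0.8

    # a weight that pushed the answer the wrong way has a positive gradient and
    # gets weakened; one that helped has a negative gradient and gets strengthened
    new_weight = old_weight - learning_rate * gradient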


Exactly. The someFactor is the error of the next layer, calculated with the derivatives of the functions (just as you would find the minimum of a function using its derivatives). The tricky part is calculating those derivatives, but since automatic differentiation came along we can do a lot of cool stuff.
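
By automatic differentiation they presumably mean tools like PyTorch's autograd (my example, not the commenter's), which derive the gradients from the forward computation so you never write them out by hand:

    import torch

    x = torch.tensor(2.0, requires_grad=True)
    y = torch.sigmoid(x) ** 2   # any composition of differentiable operations
    y.backward()                # reverse-mode autodiff fills in x.grad
    print(x.grad)               # d(sigmoid(x)^2)/dx, no hand-derived formula needed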


The differentiation is probably why I wouldn't have bothered to hack it myself. I'm curious how you or others would tackle it. What do you mean by auto differentiation?




