Ido Rosen points us to this interesting and detailed post by Andrej Karpathy, “A Recipe for Training Neural Networks.” It reminds me of things Bob Carpenter has said about how some fitting algorithms are oversold because their presenters don’t explain the tuning that was required to get good answers. I also like how Karpathy presents things; it reminds me of Bayesian workflow.
The only thing I’d add is fake-data simulation.
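For readers unfamiliar with the idea, here is a minimal sketch of fake-data simulation: pick “true” parameter values, simulate data from the model, refit, and check that the procedure recovers the parameters you started with. The linear model and all numbers below are hypothetical, just to illustrate the loop.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: choose "true" parameters (hypothetical values for illustration).
true_a, true_b, sigma = 1.5, -0.8, 0.5

# Step 2: simulate fake data from the assumed model y = a + b*x + noise.
x = rng.uniform(0, 10, size=1000)
y = true_a + true_b * x + rng.normal(0, sigma, size=1000)

# Step 3: fit the model to the fake data (ordinary least squares here).
X = np.column_stack([np.ones_like(x), x])
est_a, est_b = np.linalg.lstsq(X, y, rcond=None)[0]

# Step 4: check that the estimates recover the known truth.
print(f"a: true {true_a}, estimated {est_a:.3f}")
print(f"b: true {true_b}, estimated {est_b:.3f}")
```

If the estimates land far from the values you simulated from, something is wrong with the model, the code, or the fitting procedure, and you have found out before touching real data.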
I’m also interested in the ways that deep-learning workflow differs, or should differ, from our Bayesian workflow when fitting traditional models. I don’t know enough about deep learning to know what to say about this, but maybe some of you have some ideas.