Latest Updates of All Collections
Line Search Methods (Created: 2023-02-15)
[The learning rate] is often the single most important hyperparameter and one should always make sure that it has been tuned. — Yoshua Bengio

Introduction: The optimization iteration is given by \[x_{k+1} \gets x_k - \alpha_k p_k,\] where \(\alpha_k \in \mathbb R\) is the learning rate and \(p_k \in \mathbb...
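As a minimal sketch of the iteration in this excerpt, the following assumes the direction \(p_k\) is the gradient of the objective and that the learning rate \(\alpha_k\) is held fixed at a constant \(\alpha\) (the post itself allows a per-step \(\alpha_k\)):

```python
import numpy as np

def descent(grad, x0, alpha=0.5, steps=50):
    """Iterate x_{k+1} = x_k - alpha * p_k, taking p_k = grad(x_k).

    Assumptions: p_k is the gradient direction and alpha is constant;
    a real line-search method would choose alpha_k adaptively each step.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x - alpha * grad(x)
    return x

# Example: minimize f(x) = ||x||^2 / 2, whose gradient is simply x,
# so the iterates contract toward the minimizer at the origin.
x_star = descent(lambda x: x, x0=[3.0, -2.0])
```

With a well-chosen step size the iterates converge to the minimizer; tuning that step size is exactly the point of the Bengio quote above.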
The Intuition of Natural Gradient (Modified: 2023-01-03)
In this post, I will discuss the intuition behind natural gradient methods. This post is largely based on amari1998why, with a simple example of a quadratic function. Riemannian Geometry without Tears: In Euclidean geometry, we measure the length of a curve \(u: \mathbb R \to \mathbb R^{d}\) by...