training
Growing models in the hidden size dimension
As explained in our previous article, growing a model in most dimensions is quite simple, but increasing the hidden size comes with a few problems. This article dives deep and shows how it can be done. Why, again? We can simply grow a model’s MLP (intermediate size) or number of …