October 2024 – AI-ASSeSS

Growing models in the hidden size dimension

As explained in our previous article, growing a model in most dimensions is quite simple, but increasing the hidden size comes with a few problems. This article dives deep and shows how it can be done. Why, again? We can simply grow a model’s MLP (intermediate size) or number of Read more…

By coldint, October 25, 2024

incentive

On incentive (part 3)

In this post, we revisit model comparison metrics. Should we compare sample-by-sample, or group several samples together? Or does it make more sense to “pack” samples (i.e. join multiple samples with an EOS token in between)? Does length matter? And what’s up with these “pages” in the dataset? TL;DR A Read more…

By coldint, October 14, 2024

training

Growing models

In this blog and in our Discord channel, we discuss training in detail. A topic that is often overlooked is how to grow a model. Especially in incentivized, collaborative and distributed training, this is a key ingredient. This post explores the concept of model growth with concrete Python code examples Read more…

By coldint, October 1, 2024