2025-12-16
There are also some things Bolmo lets us do which we just can't do using subword-level LMs. For example, we can increase the compression in bytes per patch to achieve an arbitrary speedup⚡ In contrast, subword-level LMs eventually get yote back by the Softmax Bottleneck. [image]
VentureBeat
Allen Institute for AI launches Bolmo 7B and Bolmo 1B, claiming they are “the first fully open byte-level language models”, built on its Olmo 3 models
and every token gets the same compute, regardless of complexity. Benjamin Minixhofer / @bminixhofer : There are also some things Bolmo lets us do which we just can't do using subwo...
We are releasing Bolmo today! Bolmo is the best byte-level model so far. It comes close to and sometimes surpasses Olmo 3. Bolmo also performs competitively in terms of speed & is fully open. I was skeptical of byte-level models for a long time but I finally switched camps🧵 [image]
VentureBeat
Allen Institute for AI launches Bolmo 7B and Bolmo 1B, claiming they are “the first fully open byte-level language models”, built on its Olmo 3 models
and every token gets the same compute, regardless of complexity. Benjamin Minixhofer / @bminixhofer : There are also some things Bolmo lets us do which we just can't do using subwo...