rm_rafailov · TEXXR

2025-10-02

The biggest advantage of Tinker is that it lets you run your own environments or interaction loops, which will hugely accelerate training custom agents!

2025-10-02

Wired

Mira Murati's Thinking Machines Lab launches its first product, Tinker, an API for fine-tuning language models, in private beta, with support for Qwen and Llama

Today, we are launching Tinker, a flexible API for fine-tuning language models. Moneycontrol: Ex-OpenAI CEO Mira Murati stealth AI lab launches its first ever product. Matthias Bas...


Very excited to share what I have been working on with a great team of people at @thinkymachines. Tinker is a whole new way to train and customize models all the way up to frontier scale. Most importantly, it allows everyone to use their own code, data, tools and environments,

2025-10-02

Wired

Mira Murati's Thinking Machines Lab launches its first product, Tinker, an API for fine-tuning language models, in private beta, with support for Qwen and Llama

Today, we are launching Tinker, a flexible API for fine-tuning language models. Moneycontrol: Ex-OpenAI CEO Mira Murati stealth AI lab launches its first ever product. Matthias Bas...


2024-06-14

@natolambert I am somewhat doubtful they used the original MDPO algorithm in a token-level space. There are ways to frame online *PO as mirror descent; I would assume they use some version of that in combination with the LOO approach from Cohere.

2024-06-14

Interconnects

A look at Apple's technical approach to AI, including core model performance, alignment strategies, and adapter and on-device strategy

Apple Intelligence makes a lot of sense when you get out of the AI bubble. Plus, the cool technical details Apple shared about their language models “thinking different.”


2023-12-02

I saw this challenge https://aimoprize.com/ to develop an AI that can win a gold medal at the IMO. I competed at that level a couple of times (only silver medals though) and have been working on RL and LLMs for a bit. Here are my thoughts on what the challenges are: 1/N

2023-12-02

Wall Street Journal

Algorithmic trading firm XTX Markets announces the $10M AI-MO Prize for a public AI model that can win a gold medal in the International Mathematical Olympiad

AI isn't smart enough to win a gold medal at the math Olympics—yet. Can a billionaire's money change that? — Alex Gerko cannot wait to lose $10 million.
