This is an automated archive made by the Lemmit Bot.

The original was posted on /r/singularity by /u/InTheDarknesBindThem on 2024-10-03 19:30:58+00:00.


“We measured progress with over 20 automated internal evaluations. We used novel synthetic data generation techniques, such as distilling outputs from OpenAI o1-preview, to post-train the model for its core behaviors. This approach allowed us to rapidly address writing quality and new user interactions, all without relying on human-generated data.”

Please correct me if I'm wrong, but if I'm reading this right, they were able to use o1-preview in place of the humans they previously relied on for fine-tuning responses and getting key behaviors to work in post-training.

Or, in short, AI fine-tuning AI (under supervision).