This is an automated archive made by the Lemmit Bot.
The original was posted on /r/machinelearning by /u/Wiskkey on 2023-09-21 17:01:28.
This Twitter thread claims that OpenAI’s new language model gpt-3.5-turbo-instruct can “readily” beat Lichess Stockfish level 4. This tweet shows the style of prompts that are being used to get these results with the new language model.
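The tweet's exact prompt is not reproduced in this archive. As a rough sketch only, assuming the prompt style is a PGN-like game transcript that a completion model is asked to continue, a prompt could be assembled as below. The header values and the function name `build_pgn_prompt` are hypothetical placeholders, not taken from the tweet.

```python
def build_pgn_prompt(moves, white="White Player", black="Black Player"):
    """Format a list of SAN moves as an unfinished PGN-style transcript.

    Assumption: the model continues the transcript with the next move.
    The header values here are illustrative placeholders.
    """
    headers = [
        '[Event "Casual Game"]',
        f'[White "{white}"]',
        f'[Black "{black}"]',
        '[Result "*"]',
    ]
    parts = []
    for i, san in enumerate(moves):
        if i % 2 == 0:
            # White's move: prefix it with the move number, e.g. "2."
            parts.append(f"{i // 2 + 1}.")
        parts.append(san)
    # If White is to move next, end on the upcoming move number so the
    # completion starts directly with White's move.
    if len(moves) % 2 == 0:
        parts.append(f"{len(moves) // 2 + 1}.")
    return "\n".join(headers) + "\n\n" + " ".join(parts)


if __name__ == "__main__":
    print(build_pgn_prompt(["e4", "e5", "Nf3"]))
```

The returned string would be sent as the prompt to a completion-style model, and the first move-like token in the reply would be read as the model's move.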
I used the website parrotchess[dot]com (discovered here) to play multiple games of chess purportedly pitting this new language model against various levels of Fairy-Stockfish 14 on Lichess. My current results for all completed games: the language model is 2-0 vs. Fairy-Stockfish 14 level 5 (game 1, game 2) and 0-2 vs. Fairy-Stockfish 14 level 6 (game 1, game 2). I aborted one further game because the language model apparently attempted an illegal move.
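How the web app detects an illegal move is not stated. As a hedged illustration of one first step, the sketch below pulls the first move token out of a model completion and checks only that it looks like standard algebraic notation. This is a syntax check, not a legality check: true legality depends on the board position, which this sketch does not track. The names `SAN_PATTERN` and `extract_first_move` are hypothetical.

```python
import re

# Rough pattern for standard algebraic notation (SAN): castling, piece moves
# with optional disambiguation and capture, pawn moves with optional capture
# and promotion, and an optional check/mate suffix.
SAN_PATTERN = re.compile(
    r"^(O-O-O|O-O"
    r"|[KQRBN][a-h]?[1-8]?x?[a-h][1-8]"
    r"|[a-h](x[a-h])?[1-8](=[QRBN])?)[+#]?$"
)


def extract_first_move(completion):
    """Return the first SAN-looking token in a completion, or None."""
    for token in completion.split():
        # Drop a move number such as "3." or one fused to a move ("2.Nf3").
        token = re.sub(r"^\d+\.+", "", token)
        if not token:
            continue
        return token if SAN_PATTERN.match(token) else None
    return None


if __name__ == "__main__":
    print(extract_first_move(" Nf3 Nc6 3. Bb5"))
```

A token that passes this check would still need to be validated against the current position by a chess library or engine before being played.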
The following is a screenshot from the aforementioned chess web app showing the end state of the first game vs. Fairy-Stockfish 14 level 5:
There are several other purported ways to play chess against the new language model if you have access to the OpenAI API. The first is the chess web app gptchess[dot]vercel[dot]app (discovered in this Twitter thread). Another person modified that web app to additionally let various levels of the Stockfish chess engine autoplay against the model, resulting in the chess web app chessgpt-stockfish[dot]vercel[dot]app (discovered in this tweet).
Last month's post "Chess as a case study in hidden capabilities in ChatGPT" covers a different prompting style used for the older chat-based GPT 3.5 Turbo language model. If I recall correctly from my tests, that prompt style with the older language model can defeat Stockfish level 2 at Lichess, but I haven't been successful in using it to beat Stockfish level 3. In my tests, the new prompt style with the new language model plays better and attempts illegal moves less often than the older prompt style with the older language model.
Related article: Large Language Model: world models or surface statistics?