ChatGPT Got A Secret Update Last Week, And It’s Performing At Its Best

Increasingly, AI companies are testing new and experimental models under strange names on the LMSYS Chatbot Arena and quietly deploying them without any release notes. Case in point: since last week, X users have been discussing improved ChatGPT performance on both coding and creative tasks. Many believed it was a new OpenAI model, likely related to Project Strawberry, a new advanced reasoning engine.

Finally, OpenAI let the genie out of the bottle and revealed that ChatGPT is indeed running a new model. It’s not a new frontier-class model but an improved GPT-4o model. The release note says that it is an updated GPT-4o model optimized for chat, and its name is chatgpt-4o-latest. Based on qualitative feedback and experiment results, OpenAI has tuned the GPT-4o model for better performance.

OpenAI further says that it continues to remove bad data from the training dataset and add good data, along with “experimenting with new research methods.” This is where the intrigue begins. Project Strawberry is supposed to bring a new post-training method to improve reasoning. Is the new ChatGPT model already running the Strawberry engine?

I can’t say for sure, but many X users noticed that ChatGPT now uses multi-step reasoning to give correct answers. In this method, the model improves itself by generating various step-by-step reasoning rationales and, ultimately, arriving at a correct conclusion.

By the way, OpenAI also tested the new ChatGPT model on LMSYS under the name “anonymous-chatbot” and it received more than 11,000 votes. The new “chatgpt-4o-latest” model has again taken the first spot, outranking other AI models from Google, Anthropic, and Meta. It has become the first model to score 1314 points in LMSYS Arena.

Does the New ChatGPT Model Pass the Vibe Test?

To test the updated ChatGPT model, I tried a few reasoning prompts and did not find much difference between the older and the latest model. I asked it which number is bigger, 9.11 or 9.9, and it answered correctly, just as before. I also ran other commonsense reasoning questions, and its answers were in line with the older model's.
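
For reference, the expected answer to the 9.11 versus 9.9 prompt can be checked with a few lines of Python. This is only a minimal sketch of the ground truth; the helper name `bigger` is made up for illustration and has nothing to do with how ChatGPT arrives at its answer.

```python
from decimal import Decimal


def bigger(a: str, b: str) -> str:
    """Return whichever of two decimal strings has the larger numeric value.

    `bigger` is a hypothetical helper used only for this illustration.
    """
    # Decimal parses the strings exactly, so 9.9 is treated as 9.90, not as "9 and 9".
    return a if Decimal(a) > Decimal(b) else b


print(bigger("9.11", "9.9"))  # prints 9.9
```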

However, it still gets some prompts wrong. For example, in response to the prompt below, it tells me to stack the 9 eggs on top of the bottle, which is impossible.

Here we have a book, 9 eggs, a laptop, a bottle and a nail. Please tell me how to stack them onto each other in a stable manner.

In another test, it says that there are only two “R”s in the word strawberry, which is again incorrect: the word contains three.

how many Rs are in strawberry?
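
The correct count for the strawberry prompt is easy to verify programmatically. The snippet below is a small sketch for checking the answer by hand; `count_letter` is an illustrative name, not part of any ChatGPT or OpenAI tooling.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word.

    `count_letter` is a hypothetical helper used only for this illustration.
    """
    return word.lower().count(letter.lower())


print(count_letter("strawberry", "R"))  # prints 3
```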

It may be that the new ChatGPT model has not yet been rolled out widely. Either way, with OpenAI’s new model, we can expect improvements in other key areas. If you have any queries, let us know in the comments below.
