OpenAI has released GPT-4 – A new powerful AI model which understands images has been released by OpenAI. OpenAI calls this new powerful model GPT-4, referring to it as “the latest milestone in its effort in scaling up deep learning.”
Pricing is $0.03 per 1,000 “prompt” tokens (about 750 words) and $0.06 per 1,000 “completion” tokens (again, about 750 words). Tokens represent raw text; for example, the word “fantastic” would be split into the tokens “fan,” “tas,” and “tic.” Prompt tokens are parts of words fed into GPT-4, whereas completion tokens are the content generated by GPT-4.
Stripe is another early adopter, using GPT-4 to scan business websites and provide a summary to customer support staff. Duolingo turned GPT-4 into a new language learning subscription tier. Morgan Stanley is developing a GPT-4-powered system that will retrieve information from company documents and serve it to financial analysts. GPT-4 is also being used by Khan Academy to create an automated tutor.
GPT-4 can generate text as well as accept image and text inputs, which is an improvement over GPT-3.5, which only accepted text, and performs at a “human level” on a variety of professional and academic benchmarks. GPT-4, for example, passes a simulated bar exam with a score in the top 10% of test takers; GPT-3.5’s score was in the bottom 10%.
According to the company, OpenAI spent six months “iteratively aligning” GPT-4 using lessons from an internal adversarial testing program as well as ChatGPT, resulting in “best-ever results” on factuality, steerability, and refusing to go outside of guardrails. GPT-4, like previous GPT models, was trained using publicly available data, including data from public web pages and data licensed by OpenAI.
OpenAI collaborated with Microsoft to build a “supercomputer” in the Azure cloud from the ground up, which was used to train GPT-4.
The distinction between GPT-3.5 and GPT-4 can be subtle in casual conversation, OpenAI wrote in a blog post announcing GPT-4. “The difference becomes apparent when the task’s complexity reaches a sufficient threshold — GPT-4 is more reliable, creative, and capable of handling far more nuanced instructions than GPT-3.5.”
One of GPT-4’s most intriguing features is its ability to understand both images and text. GPT-4 can caption — and even interpret — relatively complex images, such as identifying a Lightning Cable adapter from a photograph of an iPhone plugged in.
The image understanding capability is not yet available to all OpenAI customers; for the time being, OpenAI is testing it with a single partner, Be My Eyes. Be My Eyes’ new Virtual Volunteer feature, which is powered by GPT-4, can answer questions about images sent to it. A blog post by the company explains how it works:
“For example, if a user sends a picture of the inside of their refrigerator, the Virtual Volunteer will not only be able to correctly identify what’s in it but also extrapolate and analyze what can be prepared with those ingredients. The tool can also then offer a number of recipes for those ingredients and send a step-by-step guide on how to make them.”
The aforementioned steerability tooling could be a more significant improvement in GPT-4. OpenAI introduces a new API capability, “system” messages, with GPT-4, which allows developers to prescribe style and tasks by describing specific directions. System messages, which will be added to ChatGPT in the future, are essentially instructions that set the tone — and set boundaries — for the AI’s subsequent interactions.
For example, a system message might read: “You are a tutor that always responds in the Socratic style. You never give the student the answer, but always try to ask just the right question to help them learn to think for themselves. You should always tune your question to the interest and knowledge of the student, breaking down the problem into simpler parts until it’s at just the right level for them.”
Even with system messages and other upgrades, OpenAI admits that GPT-4 is far from perfect. It still “hallucinates” facts and makes errors in reasoning, sometimes with great confidence. GPT-4 incorrectly described Elvis Presley as the “son of an actor” in one example cited by OpenAI.
“GPT-4 generally lacks knowledge of events that have occurred after the vast majority of its data cut off (September 2021), and does not learn from its experience,” OpenAI wrote. “It can sometimes make simple reasoning errors which do not seem to comport with competence across so many domains or be overly gullible in accepting obvious false statements from a user. And sometimes it can fail at hard problems the same way humans do, such as introducing security vulnerabilities into code it produces.”
OpenAI does note, however, that it has made improvements in specific areas; for example, GPT-4 is less likely to refuse requests on how to synthesize dangerous chemicals. According to the company, GPT-4 is 82% less likely to respond to requests for “disallowed” content than GPT-3.5 and responds to sensitive requests — such as medical advice and anything pertaining to self-harm — in accordance with OpenAI’s policies 29% more often.