OpenAI announced GPT-4 Turbo at its first developer conference, an upgraded version of its flagship text-generating AI model, GPT-4, that the company claims is both "more powerful" and less expensive.
GPT-4 Turbo is available in two versions: one that only analyzes text and another that recognizes the context of both text and graphics. The text-analyzing model is now accessible in preview via an API, and OpenAI says both will be broadly available "in the coming weeks."
They cost $0.01 for 1,000 input tokens (750 words), where "tokens" represent raw text fragments (for example, the word "fantastic" broken into "fan," "tas," and "tic"), and $0.03 per 1,000 output tokens. (Input tokens are tokens fed into the model, and output tokens are tokens generated by the model depending on the input tokens.) The price of the image-processing GPT-4 Turbo will be determined by the size of the image. Passing an image with 10801080 pixels to GPT-4 Turbo, for example, will cost $0.00765, according to OpenAI.
We optimized performance so we’re able to offer GPT-4 Turbo at a 3x cheaper price for input tokens and a 2x cheaper price for output tokens compared to GPT-4.
GPT-4 Turbo provides several advantages to GPT-4, including a more recent knowledge base from which to respond to requests.
GPT-4 Turbo, like all language models, is primarily a statistical tool for word prediction. GPT-4 Turbo learned how frequently words are to occur based on patterns, including the semantic context of surrounding text, after being fed a vast amount of examples, mostly from the web. For example, if an email ends with the phrase "Looking forward...", GPT-4 Turbo might finish it with "... to hearing back."
GPT-4 was trained on web data till September 2021, while the knowledge cut-off for GPT-4 Turbo is April 2023. That should mean that questions about current events, or events that occurred before the new cut-off date, will generate more accurate replies. GPT-4 Turbo features a larger context window as well.
Context window refers to the text that the model considers before generating any new text, measured in tokens. Models with limited context windows have a tendency to "forget" the substance of even recent talks, causing them to stray – frequently in dangerous ways.
GPT-4 Turbo has a 128,000-token context window, which is four times larger than GPT-4 and the largest of any commercially available model, topping even Anthropic's Claude 2. (Claude 2 supports up to 100,000 tokens; Anthropic claims to be testing a 200,000-token context window but has yet to disclose it publicly.) Indeed, 128,000 tokens equate to approximately 100,000 words or 300 pages, which is roughly the length of "Wuthering Heights," "Gulliver's Travels," and "Harry Potter and the Prisoner of Azkaban."
GPT-4 Turbo also has a new "JSON mode," which assures that the model returns proper JSON — the open standard file format and data transfer protocol. According to OpenAI, this is important in web apps that transmit data, such as those that convey data from a server to a client so it can be displayed on a web page. Other new settings will allow developers to make the model return "consistent" completions more of the time, as well as log probabilities for the most likely output tokens generated by GPT-4 Turbo for more niche applications.
GPT-4 Turbo performs better than our previous models on tasks that require the careful following of instructions, such as generating specific formats (e.g. ‘always respond in XML’). And GPT-4 Turbo is more likely to return the right function parameters.
In developing GPT-4 Turbo, OpenAI did not overlook GPT-4. The business has started an experimental access program to fine-tune GPT-4. Unlike the fine-tuning effort for GPT-3.5, GPT-4's predecessor, the GPT-4 program will include additional oversight and assistance from OpenAI teams, according to the company, owing to technical challenges.
Preliminary results indicate that GPT-4 fine-tuning requires more work to achieve meaningful improvements over the base model compared to the substantial gains realized with GPT-3.5 fine-tuning.
In other news, OpenAI has doubled the tokens-per-minute rate limit for all paying GPT-4 users. However, price will stay unchanged at $0.03 per input token and $0.06 per output token (for the GPT-4 model with an 8,000-token context window) or $0.06 per input token and $0.012 per output token (for the GPT-4 model with a 32,000-token context window).