On May 13, OpenAI officially launched GPT-4o, its next AI model. Since GPT-4 is already the basis of much of the hype around generative AI, 4o could be poised to send shockwaves throughout the industry. Here’s everything OpenAI revealed about the new AI technology, and why it’s a big step forward.

1. Real-time voice conversations

No keyboard required

There’s a strong focus on real-time voice exchanges with GPT-4o. The model can pick up on the tone of your voice, and will try to respond in an appropriate tone of its own. You can even ask it to add more or less drama to its response, or to use a different voice – say, a robotic one for a story told by a robot, or singing for the end of a fairy tale.

Perhaps more significantly, you can interrupt the AI at any time – say, if it’s getting a request wrong, or you want to change its tone or voice mid-stream. 4o will do its best to correct itself, using the rest of the conversation as context. In a staged demonstration by OpenAI, this all felt very natural, with the AI even apologizing when someone pointed out that it was missing some critical source data.

2. Better vision capabilities and multilingual support

Words aren’t always enough

GPT-4o can also answer questions about photos and desktop screenshots. These may be similar to ones you’d ask Meta and Ray-Ban’s smart glasses or the Humane AI Pin – something like “What brand of pants are these?” – but are potentially more complex, such as explaining a block of app code, or translating a restaurant menu. OpenAI says that down the road, 4o may be capable of even more complicated tasks, such as watching live sports and explaining the rules involved. For now, the focus appears to be on static images rather than video.

Related to vision are improved multilingual functions. 4o is claimed to have better performance across 50 different languages, with an API twice as fast as the one for GPT-4 Turbo.
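For developers, these features surface through OpenAI’s existing Chat Completions API. As a minimal sketch – assuming the `gpt-4o` model name, the SDK’s standard text-plus-`image_url` message format, and a placeholder menu photo URL – a menu-translation request could be assembled like this:

```python
# Sketch: building a GPT-4o request that mixes text and an image,
# e.g. asking the model to translate a photographed restaurant menu.
# The message shape (a list of "text" / "image_url" parts) follows the
# OpenAI Chat Completions vision format; the menu URL is a placeholder.

def build_vision_request(question: str, image_url: str) -> dict:
    """Assemble a Chat Completions payload for a question about an image."""
    return {
        "model": "gpt-4o",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

payload = build_vision_request(
    "Translate this restaurant menu to English.",
    "https://example.com/menu.jpg",  # placeholder image URL
)

# Actually sending it requires an API key and the `openai` package:
#   from openai import OpenAI
#   reply = OpenAI().chat.completions.create(**payload)
#   print(reply.choices[0].message.content)
```

The same payload shape covers the “What brand of pants are these?” style of question – only the text part changes.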

Two people talking to GPT-4o on a phone.

3. You can create images with readable text

Extending the possibilities of AI art

Generating images with legible text has long been a weak point of AI, but GPT-4o appears more capable in this regard. Text can not only be legible, but arranged in creative ways, such as typewriter pages, a movie poster, or using poetic typography. It also appears to be adept at emulating handwriting, to the point that some prompts might create images indistinguishable from real human output.

You can even ask 4o to include doodles in the margins.

4. Native Mac and Windows apps

Quicker, more powerful access

OpenAI finally has a ChatGPT desktop app, and Mac users get first dibs. A Windows version will be launched “later this year,” according to OpenAI.

5. Everyone can access GPT-4o for free

Down with gatekeeping

Q: What is GPT-4o?

GPT-4o is an evolution of the GPT-4 AI model, currently used in services like OpenAI’s own ChatGPT. The O stands for “omni” – not because it’s omniscient, but because it unifies voice, text, and vision. That contrasts with GPT-4, which is mostly about typed text interactions, exceptions like image generation and voice transcription notwithstanding.

Q: How and when is GPT-4o going to be available?

It’s not clear when we’ll see GPT-4o migrate outside of ChatGPT, for example to Microsoft Copilot. But OpenAI is opening the chatbots in the GPT Store to free users, and it would be odd if third parties didn’t leap on technology easily accessible through ChatGPT. The company is being cautious, however – for its voice and video tech, it’s beginning with “a small group of trusted partners,” citing the possibility of abuse.
