Technology

ChatGPT got an upgrade to make it seem more human

OpenAI's new ChatGPT model, called GPT-4o, provides more human-like interactions through a voice mode, and it is capable of conversations that incorporate text, audio and video in real time

By Jeremy Hsu

13 May 2024

OpenAI’s latest model offers a more human-like conversational experience

JIYI Image / Alamy

OpenAI announced its newest artificial intelligence model, called GPT-4o, which will soon power some versions of the company’s ChatGPT product. The upgraded ChatGPT can swiftly respond to text, audio and video inputs from its real-time conversational partner – all while speaking with inflections and wording that convey a strong sense of emotion and personality.

The company demonstrated the emotional mimicry of the new voice mode during a supposedly live OpenAI presentation, featuring both the ChatGPT mobile app and a new desktop app, on 13 May. Speaking in a female-sounding voice and responding to the name ChatGPT, the new AI seemed more akin in its conversational capabilities to the personable AI voiced by Scarlett Johansson in the 2013 science fiction film Her than to the more canned and robotic responses of typical voice assistant technologies.

“The new GPT-4o voice-to-voice interaction more closely parallels human-human interaction,” says Michelle Cohn at the University of California, Davis. “A big part of this is the short lag times… but an even bigger part is the level of emotional expressiveness the voice generates.”

During a conversation with company CTO Mira Murati and two other employees, the GPT-4o-powered ChatGPT advised OpenAI’s Mark Chen on his heavy and fast-paced breathing by saying “Whoa, slow down, you’re not a vacuum cleaner” and then suggesting a breathing exercise. The AI also visually examined a drawing by OpenAI’s Barret Zoph, which included words and a heart, by responding in gushing tones: “Aw, I see you wrote I love ChatGPT, that is so sweet of you.”

The new ChatGPT also verbally instructed its conversational partners on solving a simple linear equation, explained the function of computer code and interpreted a chart showing temperature lines peaking in the summer months. When prompted, the AI even retold a made-up bedtime story several times, switching between increasingly dramatic narrations and singing the ending.

The new voice mode will first become available for paid subscribers of ChatGPT Plus in the coming weeks, said Sam Altman, CEO of OpenAI, in a post on the platform X.

ChatGPT was able to recover conversationally even from the occasional technical glitch. When asked to interpret the facial expressions and emotions in a selfie of Zoph, the AI first suggested that it was looking at a wooden surface from a previous image before being prompted to evaluate the latest image.

“Ahh, there we go – it looks like you’re feeling pretty happy and cheerful with a big smile and a touch of excitement,” said ChatGPT. “Whatever is going on, it looks like you’re in a good mood. Care to share the source of those good vibes?”

When told that it was because the live demo with ChatGPT was showcasing how “useful and amazing you are”, the AI responded: “Stop it, you’re making me blush.”

But Murati acknowledged that the updated version of ChatGPT powered by GPT-4o – which the company says will eventually be made available to even free ChatGPT users – comes with new safety risks because of how it incorporates and interprets real-time information. She said that OpenAI has been working on building in “mitigations against misuse”.

“Having seamless multimodal conversations is really difficult, so the demos are impressive,” says Peter Henderson at Princeton University in New Jersey. “But as you add more modalities, safety becomes much more difficult and important – it will likely take some time to identify potential safety failure modes with such an expansion of inputs that the model makes use of.”

Henderson also described himself as “curious” to see OpenAI’s privacy terms once ChatGPT users start sharing input such as live audio and video, and whether free users can opt out of data collection that may be used to train future OpenAI models.

“Since the model appears to be hosted off-device, the fact that you could be sharing your desktop screen with the model over the internet or continually recording audio or video seems to scale up the challenge for this particular product launch, if the plan is to store and use that data,” he says.

A more anthropomorphised AI chatbot also represents another threat: a bot that can fake empathy through voice conversations could potentially sound both more personable and persuasive to people, according to research by Cohn and her colleagues. That raises the risk of people being more inclined to trust potentially inaccurate information and prejudiced stereotypes generated by such large language models.

“This has important implications for how people both search and receive guidance from large language models, particularly as they do not always generate accurate information,” says Cohn.
