Key Takeaways
- ChatGPT Voice offers Advanced Voice mode for Plus and Team users
- Standard Voice is free, but Advanced Voice requires a paid subscription
- ChatGPT Voice can translate languages, tell stories, respond quickly, and offer different voices
There are some things that are so successful or impactful that the product name becomes synonymous with the general idea. It’s far from the only search engine, but we all talk about “googling” information. We refer to images being “photoshopped” even if different software was used.
It’s fair to say that ChatGPT hasn’t quite made it to those levels yet. However, the AI chatbot has attracted so much attention that it would probably be the first name most people came up with if asked to name an AI product.
Even so, many of us haven’t used ChatGPT that much, if at all. You may be totally unaware that not only can you type messages into ChatGPT, but you can have entire voice conversations with it, just like with Siri or Alexa. However, the experience of talking to ChatGPT is far more like speaking to a real person than it is with Siri or Alexa, especially with the new Advanced Voice mode. Here’s everything you need to know about ChatGPT Voice.
Is ChatGPT Voice free to use?
The standard version is free, but you’ll need to pay for Advanced Voice
When ChatGPT Voice first launched back in September 2023, it was only available to people with ChatGPT Plus subscriptions, or Enterprise users. However, on 21 November 2023, Greg Brockman, one of the co-founders of OpenAI, posted on X that the feature was being rolled out for all free ChatGPT users, too. This means that you can use the standard ChatGPT Voice without needing to pay for a subscription.
However, the Advanced Voice feature is currently only available for ChatGPT Plus and ChatGPT Team users. If you don’t have one of these paid subscriptions, you won’t be able to use the Advanced Voice feature in ChatGPT. If you’re in the EU, Switzerland, Iceland, Norway, or Liechtenstein, you won’t yet have access to Advanced Voice even if you have a Plus or Team subscription. Users in the UK were initially unable to use the feature, but are now able to do so.
How to use the standard ChatGPT Voice
All you need to do is tap the headphones icon
Using ChatGPT Voice is incredibly easy. All you need is the ChatGPT app installed on your iOS or Android phone. You can also use the standard ChatGPT Voice mode in the ChatGPT app for macOS.
- Launch the ChatGPT app.
- In the bottom right of the screen, tap the headphones icon.
- The first time you open ChatGPT Voice, you’ll be prompted to choose one of the available voice options.
- Select your voice of choice and tap Confirm. You can change your voice selection at any time in the settings.
- A white circle will appear and bounce from side to side on your screen.
- Once it’s finished bouncing, ChatGPT Voice is ready to go.
- Start speaking. ChatGPT will respond when it thinks you’ve finished talking.
- If you find that ChatGPT interrupts you before you’ve finished speaking, tap and hold the screen whilst you’re talking. ChatGPT will then only respond once you let go of the screen.
- Once ChatGPT has responded, you can carry on talking; the AI will remember everything that came earlier, allowing you to have a natural back-and-forth conversation.
How to use Advanced Voice in ChatGPT
Plus and Team users get an even better experience
If you’re a Plus or Team subscriber and the feature is supported in your locale, you can now access Advanced Voice in ChatGPT, which makes voice mode even more like talking to a real person.
- Update the ChatGPT app to the latest version.
- Open the app and start a new chat with a model that supports voice, such as GPT-4o.
- Tap the audio waveform icon at the right-hand side of the message bar.
- The first time you use Advanced Voice, you’ll be asked to choose a voice to use.
- A circular animation will appear on-screen.
- Start talking. Once you stop, ChatGPT will immediately respond without any noticeable pause.
- If you want to interrupt the reply, start talking again and ChatGPT will stop talking and start to listen.
- When you’ve finished the voice conversation, tap the X to close it.
What can you do with ChatGPT Voice?
Anything you can type, you can ask with your voice
The short answer is that ChatGPT Voice can do pretty much everything that ChatGPT can do through the text interface, without all that tedious typing. I asked ChatGPT Voice what it was capable of, and the response was that it could ‘answer questions, provide info on various topics, perform language translation, and even do a bit of storytelling.’ And that’s a fairly accurate summary.
You can ask ChatGPT Voice for information, although there are limitations, which we’ll get into shortly. Unlike with other voice assistants, such as Siri or Alexa, however, you can drill down into the information by asking follow-up questions. For example, if ChatGPT tells you that the biggest prehistoric shark was the Megalodon and that it could reach lengths of up to 20 meters, you can follow up by asking it to put that size into context based on real-life objects. ChatGPT Voice told me that it’s equivalent to two school buses parked end to end, which is a fairly accurate estimate. Try doing that with Siri, Alexa, or Google Assistant, and you’ll get nowhere fast.
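If you want to sanity-check that school bus comparison yourself, the arithmetic is simple. The sketch below is only an illustration: the 11-meter bus length is an assumption (real school buses vary quite a bit), and the function name is made up for this example.

```python
# Assumption: a full-size school bus is roughly 11 m long. Real buses vary.
ASSUMED_SCHOOL_BUS_LENGTH_M = 11.0


def lengths_in_objects(length_m: float, object_length_m: float) -> float:
    """Return how many objects of the given length fit end to end in length_m."""
    if object_length_m <= 0:
        raise ValueError("object_length_m must be positive")
    return length_m / object_length_m


megalodon_m = 20.0  # upper length estimate quoted above
buses = lengths_in_objects(megalodon_m, ASSUMED_SCHOOL_BUS_LENGTH_M)
print(f"A {megalodon_m:.0f} m shark is about {buses:.1f} school buses long")
```

With that assumed bus length, the result comes out at roughly 1.8 buses, so "two school buses" is a reasonable rounding.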
Language translation is simple to do; just ask ChatGPT to translate a phrase into another language, and you’ll get a voice response of the translated phrase. This works well with common languages, such as Spanish, German, and French, with the voice speaking the foreign phrase with a fair attempt at the correct accent. But when I tried the same with Swedish, the voice read the translation phonetically, which sounded nothing like the actual Swedish pronunciation, so there are clear limitations. If you’re using the standard voice mode for free, you probably shouldn’t end your Duolingo streak just yet.
You can ask ChatGPT to tell you a story and choose from a range of story styles. You’ll then be read a supposedly original story, which is actually quite soothing. ChatGPT told me a short story about a village where the town clock would ring an extra chime each night. When I asked the AI to expand on the story, it added more detail. It wasn’t the greatest story I’ve ever heard, but it’s still quite an impressive feat of technology, and potentially a great way to get your kids to sleep, as long as the story doesn’t take too dark a turn.
What can ChatGPT Advanced Voice do?
Everything that standard voice can, and a lot more
Advanced Voice uses the same model to provide its responses, so all the information that you get from Advanced Voice will be the same information that free users would get from the standard Voice feature. The big difference is in what Advanced Voice can do.
The most obvious difference is that, with Advanced Voice, the responses are almost instantaneous. Instead of waiting a few seconds for ChatGPT to think before giving a reply, Advanced Voice will respond without any pause at all. This makes it feel much more like you’re talking to a real person, as you can hold a true back-and-forth conversation without any unnatural pauses. Advanced Voice also gives you the ability to interrupt just by speaking again, which can be useful when ChatGPT’s response isn’t what you wanted, although it can feel a little rude.
Even more impressive is the ability to change the tone of voice in Advanced Voice. You can ask ChatGPT to tell you a story and use different voices for the different characters, and that’s exactly what it will do. It will still sound like the same person putting on different voices, however, just as it would with a real person telling a story. You can even get ChatGPT to speak more slowly or more quickly, up to a point.
Where the standard voice mode in ChatGPT struggles to say phrases in less common foreign languages, Advanced Voice has no such problems. It was able to translate phrases into Swedish with a good approximation of the accent, rather than reading them phonetically like the standard voice mode does. It can be a genuinely useful tool for language learning, allowing you to hold a conversation in a language you’re learning and even get ChatGPT to gently correct you if your grammar is wrong.
What can’t you do with ChatGPT Voice?
AI chatbots still have some major limitations
Whilst ChatGPT can do a lot, it has some significant limitations. Probably the biggest limitation is that the current version of ChatGPT was trained on huge amounts of data, but that data only went up to October 2023. If you want to know about anything that happened beyond that date, you’re bang out of luck. Aliens might have landed and taken over the Earth in November 2023, and ChatGPT wouldn’t have a clue about it.
In real terms, this means that there are a lot of things that Siri, Alexa, or Google Assistant can do that ChatGPT Voice can’t. For example, you can ask Siri for the latest football scores, and she can provide that information in an instant. Ask ChatGPT Voice, and it will tell you that it doesn’t have real-time information and that you should look online. Not ideal for a virtual assistant.
ChatGPT Voice also won’t engage in harmful or illegal activities, provide personal information, or produce inappropriate content. And possibly most importantly, it won’t always give you accurate information. It’s become a trope among people who use AI chatbots, but if you try asking ChatGPT how many times the letter R appears in the word strawberry, the chances are that you’ll get the answer of two, despite there definitely being three.
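For contrast, counting letters is trivial for ordinary code. Here’s a minimal sketch that does the strawberry check directly; the function name is just a label chosen for this example, and it has nothing to do with how ChatGPT works internally.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    if len(letter) != 1:
        raise ValueError("letter must be a single character")
    return word.lower().count(letter.lower())


print(count_letter("strawberry", "r"))  # prints 3
```

It’s a useful reminder that a chatbot’s confident answer is worth double-checking whenever the facts are easy to verify.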
A final thing that ChatGPT Voice really can’t do is be funny in an original way. If you want a real laugh, try asking ChatGPT Voice to come up with some ‘original’ jokes. The results are quite encouraging if you were previously worried about AI taking over the world, because the current models can’t even write jokes to the level of an eight-year-old. Your job may still be safe. For now…
What do the ChatGPT Voice voices sound like?
No more Sky but the new options are great
Honestly, they’re excellent. There are nine voices to choose from, and all of them sound impressively natural. OpenAI says that they’re built with a text-to-speech model that can generate human-like audio from just a few seconds of sample speech. Each voice is based on speech recorded by a voice actor, and you wouldn’t know that the phrases were generated rather than pre-recorded.
There are two British voices, Arbor and Vale. Arbor sounds a lot like Karl Urban’s accent in Amazon Prime’s The Boys, with Vale sounding a little like Mary Poppins. The rest are a mix of American accents, from the valley girl tones of Maple to the slightly over-the-top Breeze.
The ability to create natural-sounding voices from just a few seconds of sample speech obviously has some rather sinister connotations, but it’s also quite cool that the tech is there to mimic someone else’s voice, just like in Mission: Impossible. Now all we need are prosthetic faces.
What devices does ChatGPT Voice work on?
Advanced Voice is currently still limited to phones
Currently, ChatGPT Voice is available on the Android and iOS ChatGPT apps. You can also access the standard voice mode on the ChatGPT Mac app, but it doesn’t yet support Advanced Voice. A notification in the Mac app does state that Advanced Voice is on the way, so hopefully the superior voice mode will be available on Mac soon. However, with Advanced Voice taking several months to be released after it was first showcased, Mac users may still have quite a wait.