
ChatGPT can now ‘speak,’ listen and process images, OpenAI says

Sam Altman, CEO of OpenAI, at an event in Seoul, South Korea, on June 9, 2023.

Bloomberg | Bloomberg | Getty Images

OpenAI’s ChatGPT can now “see, hear and speak” — or, at least, understand spoken words, respond with a synthetic voice and process images, the company announced Monday.

The update to the chatbot — OpenAI’s biggest since the introduction of GPT-4 — allows users to opt into voice conversations on ChatGPT’s mobile app and choose from five different synthetic voices for the bot to speak with. Users will also be able to share images with ChatGPT and highlight areas of focus or inquiry (think: “What kinds of clouds are these?”).

The changes will be rolling out to paying users in the next two weeks, OpenAI said. While voice functionality will be limited to the iOS and Android apps, the image processing capabilities will be available on all platforms.

The big feature push comes alongside ever-rising stakes in the artificial intelligence arms race among chatbot leaders such as OpenAI, Microsoft, Google and Anthropic. In an effort to encourage consumers to adopt generative AI into their daily lives, tech giants are racing to launch not only new chatbot apps, but also new features, especially this summer. Google has announced a slew of updates to its Bard chatbot, and Microsoft added visual search to Bing.

Earlier this year, Microsoft’s expanded investment in OpenAI — an additional $10 billion — made it the biggest AI investment of the year, according to PitchBook. In April, the startup reportedly closed a $300 million share sale at a valuation between $27 billion and $29 billion, with investments from firms such as Sequoia Capital and Andreessen Horowitz.

Experts have raised concerns about AI-generated synthetic voices, which in this case could allow users a more natural experience but also enable more convincing deepfakes. Cyber threat actors and researchers have already begun to explore how deepfakes can be used to penetrate cybersecurity systems.

OpenAI acknowledged those concerns in its Monday announcement, saying that synthetic voices were “created with voice actors we have directly worked with,” rather than collected from strangers.

The release also provided little word about how OpenAI would use consumer voice inputs, or how the company would secure that data if it were retained. The company’s terms of service say that consumers own their inputs “to the extent permitted by applicable law.”

OpenAI referred CNBC to the company’s guidance on voice interactions, which states that OpenAI does not retain audio clips and that the audio clips themselves are not used to improve models.

But the company also notes there that transcriptions are considered inputs and may be used to improve the large language models.
