
ChatGPT can now ‘speak,’ listen and process images, OpenAI says

Sam Altman, CEO of OpenAI, at an event in Seoul, South Korea, on June 9, 2023.

Bloomberg | Bloomberg | Getty Images

OpenAI’s ChatGPT can now “see, hear and speak” — or, at least, understand spoken words, respond with a synthetic voice and process images, the company announced Monday.

The update to the chatbot — OpenAI’s biggest since the introduction of GPT-4 — allows users to opt into voice conversations on ChatGPT’s mobile app and choose from five different synthetic voices for the bot to speak with. Users will also be able to share images with ChatGPT and highlight areas of focus or inquiry (think: “What kinds of clouds are these?”).

The changes will be rolling out to paying users in the next two weeks, OpenAI said. While voice functionality will be limited to the iOS and Android apps, the image processing capabilities will be available on all platforms.

The big feature push comes alongside ever-rising stakes in the artificial intelligence arms race among chatbot leaders such as OpenAI, Microsoft, Google and Anthropic. In an effort to encourage consumers to adopt generative AI into their daily lives, tech giants are racing to launch not only new chatbot apps, but also new features, especially this summer. Google has announced a slew of updates to its Bard chatbot, and Microsoft added visual search to Bing.

Earlier this year, Microsoft’s expanded investment in OpenAI — an additional $10 billion — made it the biggest AI investment of the year, according to PitchBook. In April, the startup reportedly closed a $300 million share sale at a valuation between $27 billion and $29 billion, with investments from firms such as Sequoia Capital and Andreessen Horowitz.

Experts have raised concerns about AI-generated synthetic voices, which in this case could allow users a more natural experience but also enable more convincing deepfakes. Cyber threat actors and researchers have already begun to explore how deepfakes can be used to penetrate cybersecurity systems.

OpenAI acknowledged those concerns in its Monday announcement, saying that synthetic voices were “created with voice actors we have directly worked with,” rather than collected from strangers.

The release also provided little word about how OpenAI would use consumer voice inputs, or how the company would secure that data if it were retained. The company’s terms of service say that consumers own their inputs “to the extent permitted by applicable law.”

OpenAI referred CNBC to the company’s guidance on voice interactions, which states that OpenAI does not retain audio clips and that the audio clips themselves are not used to improve models.

But the company also notes there that transcriptions are considered inputs and may be used to improve the large language models.
