OpenAI introduces GPT-4o: A free multimodal AI model for real-time audio, vision, and text

OpenAI has announced the release of GPT-4o, a new flagship AI model that can process and generate a combination of text, audio, and images in real-time. The “o” in GPT-4o stands for “omni,” signifying the model’s ability to handle multiple modalities.

GPT-4o is a single model trained end-to-end across text, vision, and audio. This allows the model to process all inputs and outputs using the same neural network, preserving information such as tone, multiple speakers, and background noises. According to OpenAI, GPT-4o is its first model combining all of these modalities.

GPT-4o aims to provide a more natural human-computer interaction by accepting any combination of text, audio, and image inputs and generating corresponding outputs. The model can respond to audio inputs in as little as 232 milliseconds on average, which is comparable to human response time in a conversation.

The new model matches the performance of GPT-4 Turbo on text in English and code while offering improved performance on text in non-English languages. GPT-4o is also faster and 50% cheaper in the API compared to its predecessor. The model particularly excels in vision and audio understanding compared to existing models, according to OpenAI.

GPT-4o’s text and image inputs and text outputs are being released publicly, while other modalities will be rolled out gradually by the company. Audio outputs will initially be limited to preset voices and will adhere to existing safety policies.

The new model’s capabilities will be iteratively rolled out, with extended red team access starting today. GPT-4o’s text and image capabilities are now available in ChatGPT, with a new version of Voice Mode featuring GPT-4o planned for release in alpha within ChatGPT Plus in the coming weeks.

Developers can access GPT-4o in the API as a text and vision model, with support for audio and video capabilities planned for release to a small group of trusted partners in the coming weeks.
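For developers, a request to GPT-4o mixing text and image input might look like the sketch below. It assumes the official `openai` Python SDK (v1.x) and an `OPENAI_API_KEY` in the environment; the prompt and image URL are placeholders. The actual API call is shown commented out since it requires network access and a valid key.

```python
# Minimal sketch: composing a multimodal (text + image) request for GPT-4o
# via the OpenAI Chat Completions API. Assumes the official `openai` SDK v1.x.

def build_multimodal_messages(prompt: str, image_url: str) -> list[dict]:
    """Compose a single user message mixing text and image content parts,
    in the format the Chat Completions API expects."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }
    ]

# Uncomment to send the request (requires an API key and network access):
# from openai import OpenAI
# client = OpenAI()
# response = client.chat.completions.create(
#     model="gpt-4o",
#     messages=build_multimodal_messages(
#         "What is shown in this image?",
#         "https://example.com/photo.jpg",  # placeholder image URL
#     ),
# )
# print(response.choices[0].message.content)
```

The same `messages` structure works for text-only requests by omitting the image part; audio and video input are not yet exposed in the public API, per the rollout plan above.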

GPT-4o is available in the free tier, and to Plus users with up to 5x higher message limits.

[Image courtesy: OpenAI]

Just in

Tembo raises $14M

Cincinnati, Ohio-based Tembo, a Postgres managed service provider, has raised $14 million in a Series A funding round.

Raspberry Pi is now a public company — TC

Raspberry Pi priced its IPO on the London Stock Exchange on Tuesday morning at £2.80 per share, valuing it at £542 million, or $690 million at today’s exchange rate, writes Romain Dillet. 

AlphaSense raises $650M

AlphaSense, a market intelligence and search platform, has raised $650 million in funding, co-led by Viking Global Investors and BDT & MSD Partners.

Elon Musk’s xAI raises $6B to take on OpenAI — VentureBeat

Confirming reports from April, the Series B investment comes from multiple well-known venture capital firms and investors, including Valor Equity Partners, Vy Capital, Andreessen Horowitz (A16z), Sequoia Capital, Fidelity Management & Research Company, and Prince Alwaleed Bin Talal and Kingdom Holding, writes Shubham Sharma.

Capgemini partners with DARPA to explore quantum computing for carbon capture

Capgemini Government Solutions has launched a new initiative with the Defense Advanced Research Projects Agency (DARPA) to investigate quantum computing's potential in carbon capture.