Google Launches Gemini Embedding 2: One Vector Model for Text, Image, Video, and Audio
Google, Gemini, Embeddings, Multimodal, RAG

Joachim Høgby
March 26, 2026 · 4 min read

Google DeepMind has released Gemini Embedding 2 in public preview. It is the company's first natively multimodal embedding model, mapping text, images, video, audio, and documents into a single shared vector space.

The model launched on March 10, 2026 and is available through the Gemini API and Vertex AI. Built on the Gemini architecture, it is designed to simplify complex data pipelines where different modalities have traditionally required separate embedding models.

Gemini Embedding 2 accepts text inputs of up to 8,192 tokens, up to six images per request (PNG or JPEG), up to 120 seconds of video (MP4 or MOV), audio ingested natively without a transcription step, and PDFs of up to six pages embedded directly.
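To make these limits concrete, here is a minimal sketch of a pre-flight check a pipeline might run before sending a request. The numbers come from the paragraph above; the `EmbedRequest` class, its field names, and `check_limits` are hypothetical helpers invented for illustration and are not part of the Gemini API or any Google SDK.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Limits as quoted in the article. All names below are hypothetical.
MAX_TEXT_TOKENS = 8192
MAX_IMAGES = 6
MAX_VIDEO_SECONDS = 120
MAX_PDF_PAGES = 6
IMAGE_FORMATS = {"png", "jpeg"}
VIDEO_FORMATS = {"mp4", "mov"}


@dataclass
class EmbedRequest:
    """Hypothetical summary of one request; not a type from any Google SDK."""
    text_tokens: int = 0
    image_formats: List[str] = field(default_factory=list)
    video_seconds: float = 0.0
    video_format: Optional[str] = None
    pdf_pages: int = 0


def check_limits(req: EmbedRequest) -> List[str]:
    """Return a list of problems; an empty list means the request fits."""
    problems = []
    if req.text_tokens > MAX_TEXT_TOKENS:
        problems.append(f"text has {req.text_tokens} tokens (max {MAX_TEXT_TOKENS})")
    if len(req.image_formats) > MAX_IMAGES:
        problems.append(f"{len(req.image_formats)} images (max {MAX_IMAGES})")
    unsupported = sorted({f.lower() for f in req.image_formats} - IMAGE_FORMATS)
    if unsupported:
        problems.append(f"unsupported image formats: {', '.join(unsupported)}")
    if req.video_seconds > MAX_VIDEO_SECONDS:
        problems.append(f"video is {req.video_seconds}s (max {MAX_VIDEO_SECONDS}s)")
    if req.video_format is not None and req.video_format.lower() not in VIDEO_FORMATS:
        problems.append(f"unsupported video format: {req.video_format}")
    if req.pdf_pages > MAX_PDF_PAGES:
        problems.append(f"PDF has {req.pdf_pages} pages (max {MAX_PDF_PAGES})")
    return problems


# Example: too much text and one image in a format the article does not list.
request = EmbedRequest(text_tokens=9000, image_formats=["png", "gif"], pdf_pages=3)
for problem in check_limits(request):
    print(problem)
```

A check like this only mirrors the figures reported here. The preview limits may change, so the official documentation remains the source of truth.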

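The model incorporates Matryoshka Representation Learning, a training technique that concentrates the most important information in the leading dimensions so that a vector can be shortened without retraining. The default output size is 3,072 dimensions, and developers can choose smaller sizes to match the requirements of different storage backends and search systems.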
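As a rough illustration of what flexible output dimensions mean in practice, the sketch below shortens a vector by keeping its leading components and re-normalizing, which is the usual way Matryoshka-style embeddings are reduced on the client side. The function names are hypothetical, the tiny input vector is made up, and the storage figures assume 4-byte floats. In a real deployment the API may accept the target dimension directly instead.

```python
import math
from typing import List


def truncate_embedding(vec: List[float], dim: int) -> List[float]:
    """Keep the first `dim` components and rescale to unit length.

    Assumes the vector comes from a Matryoshka-trained model, where the
    leading components carry the most information.
    """
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # nothing to rescale
    return [x / norm for x in head]


def bytes_per_vector(dim: int, bytes_per_value: int = 4) -> int:
    """Storage per vector, assuming float32 values by default."""
    return dim * bytes_per_value


# Toy 3-dimensional vector standing in for a real 3,072-dimensional one.
short = truncate_embedding([3.0, 4.0, 12.0], 2)
print(short)

# Storage comparison: the default size versus a hypothetical smaller choice.
print(bytes_per_vector(3072), bytes_per_vector(768))
```

Re-normalizing matters because cosine and dot-product search generally assume unit-length vectors, and a truncated prefix is no longer unit length.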

For developers building RAG systems, semantic search, sentiment analysis, and clustering over multimodal datasets, this is a meaningful simplification. Rather than wiring together separate text and image embedders and reconciling their incompatible vector spaces, a single model produces comparable vectors for every modality.
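The practical consequence of a shared vector space is that one index and one similarity function can serve every modality. The sketch below shows the idea with cosine similarity over a mixed index. The three-dimensional vectors and file names are invented placeholders, not output from Gemini Embedding 2, and `search` is a hypothetical helper, not a library call.

```python
import math
from typing import List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec: List[float],
           index: List[Tuple[str, List[float]]],
           top_k: int = 2) -> List[Tuple[str, float]]:
    """Rank every item in the index against the query, whatever its modality."""
    scored = [(cosine(query_vec, vec), item_id) for item_id, vec in index]
    scored.sort(reverse=True)
    return [(item_id, score) for score, item_id in scored[:top_k]]


# Placeholder vectors: in a real system each would come from the same
# embedding model, regardless of whether the source is a PDF, image, or audio.
index = [
    ("manual.pdf", [0.9, 0.1, 0.0]),
    ("product_photo.png", [0.8, 0.2, 0.1]),
    ("support_call.mp3", [0.0, 0.1, 0.9]),
]
query = [1.0, 0.0, 0.0]  # stands in for an embedded text query

results = search(query, index)
for item_id, score in results:
    print(f"{item_id}: {score:.3f}")
```

With separate embedders, the scores for the PDF and the photo would not be comparable, and the pipeline would need one index per modality plus a merging step.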

For enterprises managing product catalogs with images, technical specifications, and certification documents, a model like this opens the door to search and recommendation systems that understand the relationship between a product image and its associated text documentation in a single query.
